Commit

feat: schema validation
BREAKING CHANGE: schema validation is mandatory; error is thrown if schema is invalid
kpietraszko committed Dec 31, 2024
1 parent 27bcf55 commit d52f3f2
Showing 5 changed files with 237 additions and 34 deletions.
73 changes: 64 additions & 9 deletions README.md
@@ -1,12 +1,57 @@
# schemind
[![NPM Version](https://img.shields.io/npm/v/schemind?link=https%3A%2F%2Fwww.npmjs.com%2Fpackage%2Fschemind)](https://www.npmjs.com/package/schemind)
![Code Coverage](https://raw.githubusercontent.com/kpietraszko/schemind/refs/heads/main/badge.svg)
[![Brotli Size](https://deno.bundlejs.com/badge?q=schemind&treeshake=[*]&config={%22compression%22:%22brotli%22})](https://bundlejs.com/?q=schemind&treeshake=%5B*%5D&config=%7B%22compression%22%3A%22brotli%22%7D)

# schemind
Read and write to messages serialized as arrays (aka indexed keys) by defining a schema.
Read and write to messages serialized as arrays (aka indexed keys messages) by defining a schema. Protocol‑agnostic.

## What?
TODO
In formats like JSON, a message normally looks something like this:
```json
{
  "id": 1,
  "fullName": "John Doe",
  "email": "[email protected]",
  "birthDate": "1973-01-22",
  "address": {
    "street": "123 Main Street",
    "city": "Anytown",
    "zipcode": "12345-6789",
    "geo": {
      "lat": 42.1234,
      "lng": -71.2345
    }
  },
  "website": "www.johndoe.com"
}
```
*I'm using JSON as an example here, but schemind is essentially protocol-agnostic. I use it with MessagePack.*

If you desperately need to make this message more compact, you could alternatively serialize it as such:
```json
[
  1,
  "John Doe",
  "[email protected]",
  "1973-01-22",
  [
    "123 Main Street",
    "Anytown",
    "12345-6789",
    [
      42.1234,
      -71.2345
    ]
  ],
  "www.johndoe.com"
]
```

This is sometimes referred to as a message with *indexed keys*.

*Note that this obviously has some drawbacks: [recommended reading about the pros and cons of this format](https://github.com/MessagePack-CSharp/MessagePack-CSharp#use-indexed-keys-instead-of-string-keys-contractless).*

**Schemind** helps you create and read such messages, if your (de)serializer doesn't support this technique.
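
To make the trade-off concrete, here is a minimal hand-written sketch of the same idea on a trimmed-down version of the message above. It deliberately does not use the schemind API: `User`, `IndexedUser`, `toIndexed` and `fromIndexed` are hypothetical names introduced only for this illustration, and the position of every field is hard-coded by hand, which is the bookkeeping a schema is meant to take over.

```typescript
// Hand-written illustration of indexed keys; NOT the schemind API.
// All type and function names here are hypothetical.
type Geo = { lat: number; lng: number };
type Address = { street: string; city: string; zipcode: string; geo: Geo };
type User = { id: number; fullName: string; address: Address; website: string };

// Each field is identified by its position in the array instead of by its name.
type IndexedUser = [number, string, [string, string, string, [number, number]], string];

function toIndexed(user: User): IndexedUser {
  const { street, city, zipcode, geo } = user.address;
  return [user.id, user.fullName, [street, city, zipcode, [geo.lat, geo.lng]], user.website];
}

function fromIndexed(message: IndexedUser): User {
  const [id, fullName, [street, city, zipcode, [lat, lng]], website] = message;
  return { id, fullName, address: { street, city, zipcode, geo: { lat, lng } }, website };
}

const user: User = {
  id: 1,
  fullName: "John Doe",
  address: {
    street: "123 Main Street",
    city: "Anytown",
    zipcode: "12345-6789",
    geo: { lat: 42.1234, lng: -71.2345 },
  },
  website: "www.johndoe.com",
};

const compact = toIndexed(user);

// The indexed form drops every key name, so its serialized form is shorter.
const keyedSize = JSON.stringify(user).length;
const compactSize = JSON.stringify(compact).length;
console.log({ keyedSize, compactSize });
```

The drawback is visible in `fromIndexed`: reordering or inserting a field silently changes the meaning of every position after it, so both sides have to agree on the layout.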

## Installation

@@ -17,13 +62,23 @@ npm install schemind
## Usage
TODO

## FAQ
TODO
## FAQ
### Shouldn't this be an extension of a serializer?
Probably.

### Wouldn't it be better to use protobuf at this point?
Possibly. But if you're already using JSON / MessagePack / CBOR etc. in your app, and you need more compact messages for some features — *schemind* could be useful.

Additionally, in some languages (backend or frontend) there's a MessagePack or JSON implementation that's faster, or allocates less memory, than protobuf.

### Why is `get` so inconvenient?
The `get` function prioritizes performance over convenience. The main goal here is to avoid any heap allocations (beyond what your deserializer allocates). I use *schemind* in performance-critical scenarios, where avoiding GC pauses is crucial.
Use the `toPlainObject` function instead, if you don't mind some extra allocations.
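
As a rough illustration of what an allocation-free lookup can look like, here is a minimal sketch that walks a reversed index path with a plain loop. It is an assumption about the approach, not the library's actual `get` body (which is only partially visible in this diff): `getByReversedPath` is a hypothetical name, and the sketch assumes the leaf's own index is stored first and the outermost index last.

```typescript
// Hypothetical helper, not part of schemind. Assumes the path is stored
// reversed: [leafIndex, ..., outermostIndex].
function getByReversedPath(
  message: readonly unknown[],
  indexesPathReversed: readonly number[],
): unknown {
  let currentSlice: readonly unknown[] = message;

  // Descend from the outermost index (stored last) down to, but not
  // including, the leaf index (stored first). No arrays, objects or
  // closures are created along the way.
  for (let i = indexesPathReversed.length - 1; i >= 1; i--) {
    currentSlice = currentSlice[indexesPathReversed[i]] as readonly unknown[];
  }

  return currentSlice[indexesPathReversed[0]];
}

// A trimmed-down version of the indexed message from the "What?" section.
const sampleMessage = [
  1,
  "John Doe",
  ["123 Main Street", "Anytown", "12345-6789", [42.1234, -71.2345]],
  "www.johndoe.com",
];

const lat = getByReversedPath(sampleMessage, [0, 3, 2]);  // address -> geo -> lat
const city = getByReversedPath(sampleMessage, [1, 2]);    // address -> city
const id = getByReversedPath(sampleMessage, [0]);         // top-level id
console.log(lat, city, id);
```

Building a whole plain object, by contrast, necessarily allocates one object per nesting level, which is the extra cost `toPlainObject` accepts.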

## Related work
* https://github.com/MessagePack-CSharp/MessagePack-CSharp#use-indexed-keys-instead-of-string-keys-contractless
* https://aarnott.github.io/Nerdbank.MessagePack/docs/customizing-serialization.html?q=indexed#serialize-objects-with-indexes-for-keys
* [MessagePack-CSharp (.NET)](https://github.com/MessagePack-CSharp/MessagePack-CSharp#use-indexed-keys-instead-of-string-keys-contractless)
* [Nerdbank.MessagePack (.NET)](https://aarnott.github.io/Nerdbank.MessagePack/docs/customizing-serialization.html?q=indexed#serialize-objects-with-indexes-for-keys)


* https://github.com/Idein/msgpack-schema
* https://github.com/serde-rs/serde/issues/959
* [Idein/msgpack-schema (Rust)](https://github.com/Idein/msgpack-schema)
* [serde (Rust)](https://github.com/serde-rs/serde/issues/959)
3 changes: 2 additions & 1 deletion src/index.ts
@@ -1 +1,2 @@
export { withIndex, get, set, toPlainObject, toIndexedKeysMessage } from './indexedKeysSchema'
export { validateSchema, withIndex, get, set, toPlainObject, toIndexedKeysMessage, InvalidSchemaError } from './indexedKeysSchema'
export type { ValidIndexedKeysMessageSchema } from './indexedKeysSchema'
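
The exported `ValidIndexedKeysMessageSchema` type marks schemas that have passed `validateSchema`, so that unvalidated schemas are rejected at compile time. A minimal sketch of that kind of type-level brand follows. Every name in it (`Leaf`, `ValidLeaf`, `validatedBrand`, `validateLeaf`, `read`) is hypothetical and much simpler than the types in this commit; it only assumes, as the source comments suggest, that the brand is a type trick and is never set at runtime.

```typescript
// Hypothetical names throughout; this only illustrates the branding idea.
declare const validatedBrand: unique symbol;

type Leaf<T> = { index: number; fieldType: T };

// The brand exists only in the type system; no such property is ever set at runtime.
type ValidLeaf<T> = Leaf<T> & { [validatedBrand]: true };

// The only way to obtain a ValidLeaf is to go through the runtime check.
function validateLeaf<T>(leaf: Leaf<T>): ValidLeaf<T> {
  if (!Number.isInteger(leaf.index) || leaf.index < 0) {
    throw new Error("Invalid leaf: index must be a non-negative integer");
  }
  return leaf as ValidLeaf<T>;
}

// Consumers accept only the branded type.
function read<T>(message: readonly unknown[], leaf: ValidLeaf<T>): T {
  return message[leaf.index] as T;
}

const rawLeaf: Leaf<string> = { index: 1, fieldType: undefined as unknown as string };
const checkedLeaf = validateLeaf(rawLeaf);
const fullName = read([1, "John Doe"], checkedLeaf);
console.log(fullName);

// read([1, "John Doe"], rawLeaf); // would not compile: the brand is missing
```

The cast inside `validateLeaf` is the single place where the brand is granted, which keeps the "validated" guarantee tied to the runtime check.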
111 changes: 91 additions & 20 deletions src/indexedKeysSchema.ts
@@ -1,20 +1,91 @@
import type { NonNegativeInteger } from "type-fest";

const isSchemaLeafTag = Symbol("isSchemaLeaf");
const isValidSchemaLeaf = Symbol("isValidSchemaLeaf");

type IndexesPath = number[];
type SchemaLeaf<TField> = {
indexesPathReversed: number[],
fieldType: TField
indexesPathReversed: IndexesPath,
fieldType: TField,
[isSchemaLeafTag]: true
};

type IndexedKeysMessageSchema<TSchema> = {
type ValidSchemaLeaf<TField> = SchemaLeaf<TField> & { [isValidSchemaLeaf]: true };

export type IndexedKeysMessageSchema<TSchema> = {
[K in keyof TSchema]: TSchema[K] extends SchemaLeaf<infer TField>
? SchemaLeaf<TField>
: IndexedKeysMessageSchema<TSchema[K]>;
};

type ToValidIndexedKeysMessageSchema<TSchema> = {
[K in keyof TSchema]: TSchema[K] extends SchemaLeaf<infer TField>
? ValidSchemaLeaf<TField>
: ToValidIndexedKeysMessageSchema<TSchema[K]>;
};

const invalid = Symbol('invalid')
type Invalid<T extends string> = { [invalid]: T }

export type ValidIndexedKeysMessageSchema<TSchema> = {
[K in keyof TSchema]: TSchema[K] extends ValidSchemaLeaf<infer TField>
? ValidSchemaLeaf<TField>
: TSchema[K] extends SchemaLeaf<unknown>
? Invalid<"Schema needs to be validated before you use it, did you forget to call validateSchema()?">
: ToValidIndexedKeysMessageSchema<TSchema[K]>;
};


type ReturnedSchemaNode<TField, TNestedSchema> = TNestedSchema extends undefined ?
SchemaLeaf<TField>
: TNestedSchema;

export class InvalidSchemaError extends Error {
constructor() {
super("Invalid schema. Make sure there are no duplicate indexes, and that nested objects are also wrapped with withIndex.");
this.name = "InvalidSchemaError";
}
}

export function validateSchema<TSchema extends IndexedKeysMessageSchema<TSchemaInner>, TSchemaInner>(schema: TSchema) {
validateSchemaRecursively(schema, [], 0);
return schema as unknown as ToValidIndexedKeysMessageSchema<TSchema>;
}

function validateSchemaRecursively(
schemaNode: IndexedKeysMessageSchema<unknown>,
encounteredIndexesPaths: IndexesPath[],
currentTreeLevel: number){

for (const [_, nestedSchemaNode] of Object.entries(schemaNode)) {
const nestedNode = nestedSchemaNode as IndexedKeysMessageSchema<unknown> | SchemaLeaf<unknown>;
if (isSchemaLeaf(nestedNode)) {
validateSchemaLeaf(nestedNode, encounteredIndexesPaths, currentTreeLevel);
} else {
validateSchemaRecursively(nestedNode, encounteredIndexesPaths, currentTreeLevel + 1);
}
}
}

function validateSchemaLeaf(schemaLeaf: SchemaLeaf<unknown>, encounteredIndexesPaths: IndexesPath[], currentTreeLevel: number){
const duplicateIndexesPathDetected = encounteredIndexesPaths.some(encounteredPath =>
encounteredPath.length === schemaLeaf.indexesPathReversed.length &&
encounteredPath.every((pathElement, index) => pathElement === schemaLeaf.indexesPathReversed[index]));

if (duplicateIndexesPathDetected)
{
throw new InvalidSchemaError()
}

encounteredIndexesPaths.push(schemaLeaf.indexesPathReversed);

const indexesPathLengthDoesntMatchLevel = schemaLeaf.indexesPathReversed.length !== (currentTreeLevel + 1);
if (indexesPathLengthDoesntMatchLevel)
{
throw new InvalidSchemaError()
}
}

export function withIndex<const TIndex extends number>(index: NonNegativeInteger<TIndex>) {
return <const TField = undefined, TNestedSchema extends IndexedKeysMessageSchema<TNestedSchema> | undefined = undefined>(nestedSchema?: TNestedSchema)
: ReturnedSchemaNode<TField, TNestedSchema> => {
@@ -26,23 +97,21 @@ export function withIndex<const TIndex extends number>(index: NonNegativeInteger
return {
indexesPathReversed: [index] as number[],
fieldType: undefined as TField,
[isSchemaLeafTag]: true
} as const as ReturnedSchemaNode<TField, TNestedSchema>;
};
}

// intentionally not validating that it has the "isValidSchemaLeaf" symbol property, because it actually doesn't - it's just a type trick
function isSchemaLeaf(value: IndexedKeysMessageSchema<unknown> | ValidSchemaLeaf<unknown>): value is ValidSchemaLeaf<unknown>;
function isSchemaLeaf(value: IndexedKeysMessageSchema<unknown> | SchemaLeaf<unknown>): value is SchemaLeaf<unknown> {
const propertyOnlyInLeaf = "fieldType" satisfies keyof SchemaLeaf<unknown>;
return Object.hasOwn(value, propertyOnlyInLeaf);
return Object.hasOwn(value, isSchemaLeafTag);
}

function addIndexToPathsRecursively(
schemaNode: IndexedKeysMessageSchema<unknown> | SchemaLeaf<unknown>,
schemaNode: IndexedKeysMessageSchema<unknown>,
indexToAdd: number) {

if (isSchemaLeaf(schemaNode)) {
return;
}

for (const [_, nestedSchemaNode] of Object.entries(schemaNode)) {
const nestedNode = nestedSchemaNode as IndexedKeysMessageSchema<unknown> | SchemaLeaf<unknown>;
if (isSchemaLeaf(nestedNode)) {
@@ -53,7 +122,7 @@ function addIndexToPathsRecursively(
}
}

export function get<const TField>(message: readonly unknown[], schemaField: SchemaLeaf<TField>) {
export function get<const TField>(message: readonly unknown[], schemaField: ValidSchemaLeaf<TField>) {
const indexesPathReversed = schemaField.indexesPathReversed;
let currentSlice: readonly unknown[] = message;

@@ -65,7 +134,7 @@ export function get<const TField>(message: readonly unknown[], schemaField: Sche
return currentSlice[lastIndexInPath] as TField;
}

export function set<const TField>(targetMessage: unknown[], schemaField: SchemaLeaf<TField>, value: TField) {
export function set<const TField>(targetMessage: unknown[], schemaField: ValidSchemaLeaf<TField>, value: TField) {
const indexesPathReversed = schemaField.indexesPathReversed;
let currentSlice: unknown[] = targetMessage;

@@ -81,21 +150,23 @@ export function set<const TField>(targetMessage: unknown[], schemaField: SchemaL
currentSlice[lastIndexInPath] = value;
}

type PlainObjectOfSchema<TSchema> = TSchema extends IndexedKeysMessageSchema<unknown> ? {
[K in keyof TSchema]: TSchema[K] extends SchemaLeaf<infer TField>
type PlainObjectOfSchema<TSchema> = TSchema extends ValidIndexedKeysMessageSchema<unknown> ? {
[K in keyof TSchema]: TSchema[K] extends ValidSchemaLeaf<infer TField>
? TField
: PlainObjectOfSchema<TSchema[K]>;
: TSchema[K] extends SchemaLeaf<unknown>
? Invalid<"Schema needs to be validated before you use it!">
: PlainObjectOfSchema<TSchema[K]>;
}
: never;

export function toPlainObject<TSchema extends IndexedKeysMessageSchema<TSchemaInner>, TSchemaInner>(
export function toPlainObject<TSchema extends ValidIndexedKeysMessageSchema<TSchemaInner>, TSchemaInner>(
message: readonly unknown[],
schema: TSchema): PlainObjectOfSchema<TSchema> {

const object: Partial<PlainObjectOfSchema<TSchema>> = {};

for (const [fieldName, nestedSchemaNode] of Object.entries(schema)) {
const nestedNode = nestedSchemaNode as IndexedKeysMessageSchema<unknown> | SchemaLeaf<unknown>;
const nestedNode = nestedSchemaNode as ValidIndexedKeysMessageSchema<unknown> | ValidSchemaLeaf<unknown>;
let valueToSet = undefined;
if (isSchemaLeaf(nestedNode)) {
valueToSet = get(message, nestedNode);
@@ -109,7 +180,7 @@ export function toPlainObject<TSchema extends IndexedKeysMessageSchema<TSchemaIn
return object as PlainObjectOfSchema<TSchema>;
}

export function toIndexedKeysMessage<TSchema extends IndexedKeysMessageSchema<TSchemaInner>, TSchemaInner>(
export function toIndexedKeysMessage<TSchema extends ValidIndexedKeysMessageSchema<TSchemaInner>, TSchemaInner>(
plainObject: PlainObjectOfSchema<TSchema>,
schema: TSchema): unknown[] {

@@ -118,13 +189,13 @@ export function toIndexedKeysMessage<TSchema extends IndexedKeysMessageSchema<TS
return message;
}

function populateIndexedKeysMessage<TSchema extends IndexedKeysMessageSchema<TSchemaInner>, TSchemaInner>(
function populateIndexedKeysMessage<TSchema extends ValidIndexedKeysMessageSchema<TSchemaInner>, TSchemaInner>(
messageToPopulate: unknown[],
plainObject: PlainObjectOfSchema<TSchema>,
schema: TSchema) {

for (const [fieldName, nestedSchemaNode] of Object.entries(schema)) {
const nestedNode = nestedSchemaNode as IndexedKeysMessageSchema<unknown> | SchemaLeaf<unknown>;
const nestedNode = nestedSchemaNode as IndexedKeysMessageSchema<unknown> | ValidSchemaLeaf<unknown>;
const leafValueOrSubObject = plainObject[fieldName as keyof PlainObjectOfSchema<TSchema>];
if (isSchemaLeaf(nestedNode)) {
set(messageToPopulate, nestedNode, leafValueOrSubObject);
79 changes: 76 additions & 3 deletions test/indexedKeysSchema.test.ts
@@ -1,5 +1,12 @@
import { describe, it, expect, expectTypeOf } from "vitest";
import { withIndex as i, get, set, toPlainObject, toIndexedKeysMessage } from "../src/index";
import {
withIndex as i,
get,
set,
toPlainObject,
toIndexedKeysMessage,
validateSchema, type ValidIndexedKeysMessageSchema, InvalidSchemaError
} from "../src/index";

const someDate = new Date();
const message = [
@@ -55,7 +62,7 @@ const messageAsPlainObject = {
describe("get", () => {
it("should return value from the index - and of type - specified by the schema", () => {
const schema = createTestSchema();

const r1 = get(message, schema.anotherNumber);
expectTypeOf(r1).toBeNumber();
expect(r1).to.equal(69);
@@ -146,9 +153,73 @@ describe("toIndexedKeysMessage", () => {
});
});

describe("validateSchema", () => {
it("shouldn't throw if schema is correct", () => {
createTestSchema();
});

it("should throw if there are duplicate indexes-paths in the schema", () => {
expect(() => validateSchema({
someField: i(0)<number>(),
anotherFieldWithSameIndex: i(0)<number>(),
})).toThrowError(new InvalidSchemaError());

expect(() => validateSchema({
nestedThing: i(0)({
someNestedField: i(0)<string>()
}),
anotherNestedThing: i(0)({
anotherNestedFieldWithSameIndex: i(0)<number>()
}),
})).toThrowError(new InvalidSchemaError());
});

it("should throw if subschema isn't wrapped in withIndex", () => {
const schema = {
someNumber: i(0)<number>(),
anotherNumber: i(1)<number>(),
nestedThing: {
someNestedDate: i(1)<Date>(),
evenMoreNestedThing: i(2)({
moreNestedNumber: i(0)<number>(),
moreNestedBool: i(1)<boolean>(),
moreNestedArray: i(2)<number[]>()
}),
someNestedNumber: i(0)<number>(),
},
someString: i(2)<string>(),
someArray: i(4)<string[]>(),
someBool: i(3)<boolean>()
};

expect(() => validateSchema(schema))
.toThrowError(new InvalidSchemaError());

const schema2 = {
someNumber: i(0)<number>(),
anotherNumber: i(1)<number>(),
nestedThing: i(5)({
someNestedDate: i(1)<Date>(),
evenMoreNestedThing: {
moreNestedNumber: i(0)<number>(),
moreNestedBool: i(1)<boolean>(),
moreNestedArray: i(2)<number[]>()
},
someNestedNumber: i(0)<number>(),
}),
someString: i(2)<string>(),
someArray: i(4)<string[]>(),
someBool: i(3)<boolean>()
};

expect(() => validateSchema(schema2))
.toThrowError(new InvalidSchemaError());
});
})

export function createTestSchema() {
// the order of schema properties is intentionally shuffled here, to test that it doesn't matter
return {
const schema = {
someNumber: i(0)<number>(),
anotherNumber: i(1)<number>(),
nestedThing: i(5)({
@@ -164,4 +235,6 @@ export function createTestSchema() {
someArray: i(4)<string[]>(),
someBool: i(3)<boolean>()
};

return validateSchema(schema);
}
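
The two failure modes exercised by these tests (duplicate index paths, and a nested object that was not wrapped in `withIndex`) can be sketched as a standalone check. This is a simplified illustration under stated assumptions, not the code from this commit: `Leaf`, `SchemaNode`, `isLeaf`, `samePath` and `findSchemaProblem` are hypothetical names, leaves are detected by their `indexesPathReversed` array rather than by a symbol tag, and the function returns a description of the problem instead of throwing `InvalidSchemaError`.

```typescript
// Simplified, hypothetical stand-ins for the schema types in this commit.
type IndexesPath = number[];
type Leaf = { indexesPathReversed: IndexesPath };
type SchemaNode = { [key: string]: SchemaNode | Leaf };

function isLeaf(node: SchemaNode | Leaf): node is Leaf {
  return Array.isArray((node as Leaf).indexesPathReversed);
}

function samePath(a: IndexesPath, b: IndexesPath): boolean {
  return a.length === b.length && a.every((value, i) => value === b[i]);
}

// Returns a description of the first problem found, or undefined if the schema looks fine.
function findSchemaProblem(
  node: SchemaNode,
  seen: IndexesPath[] = [],
  level = 0,
): string | undefined {
  for (const child of Object.values(node)) {
    if (!isLeaf(child)) {
      const problem = findSchemaProblem(child, seen, level + 1);
      if (problem) return problem;
      continue;
    }
    // Two leaves pointing at the same position would overwrite each other.
    if (seen.some(path => samePath(path, child.indexesPathReversed))) {
      return "duplicate index path";
    }
    seen.push(child.indexesPathReversed);
    // A leaf nested N levels deep needs N + 1 indexes; fewer means a parent was never given an index.
    if (child.indexesPathReversed.length !== level + 1) {
      return "path length does not match nesting depth";
    }
  }
  return undefined;
}

const okSchema: SchemaNode = {
  a: { indexesPathReversed: [0] },
  nested: { b: { indexesPathReversed: [0, 1] }, c: { indexesPathReversed: [1, 1] } },
};
const duplicateSchema: SchemaNode = {
  a: { indexesPathReversed: [0] },
  b: { indexesPathReversed: [0] },
};
const unwrappedSchema: SchemaNode = {
  a: { indexesPathReversed: [0] },
  nested: { b: { indexesPathReversed: [1] } },
};

console.log(findSchemaProblem(okSchema), findSchemaProblem(duplicateSchema), findSchemaProblem(unwrappedSchema));
```

The depth check works because wrapping a sub-schema with an index appends that index to every leaf beneath it, so an unwrapped level leaves the paths one element short.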