Design Store model #2

Closed
16 tasks
Keith-CY opened this issue Oct 8, 2022 · 10 comments

@Keith-CY
Member

Keith-CY commented Oct 8, 2022

  • as an actor factor: Design actor system #4
  • default storage layout
  • pattern to gather cells
  • interfaces
    • new(pattern): fetch cells by the specific pattern and initialize a Store model
    • destroy(): remove data in the Store model
    • duplicate(): create a new Store model from plain cells with the same data
    • sync(blockNumber?): load data with pattern defined on initialization at a specific block number
    • get(path): get a value at the specific path
    • set(path, value): update a value at the specific path
    • delete(path): remove key-value pair at the specific path
  • metadata of a storage
    • schema
    • serialization
    • how to share/propagate metadata
  • lazy evaluation of updates
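The interface list above could be written down roughly as follows. This is only a sketch to make the discussion concrete: the names `StoreModel`, `MemoryStore`, `Pattern`, `Path` and `Tree` are assumptions, the pattern is reduced to a plain string, and `sync` is a stub because cell gathering is not designed yet.

```typescript
// Hypothetical names throughout: the issue has not fixed these types yet.
type Pattern = string
type Path = string[]
type Tree = Record<string, unknown>

interface StoreModel {
  destroy(): void
  duplicate(): StoreModel
  sync(blockNumber?: bigint): Promise<void>
  get(path: Path): unknown
  set(path: Path, value: unknown): void
  delete(path: Path): void
}

const isTree = (node: unknown): node is Tree => typeof node === 'object' && node !== null

class MemoryStore implements StoreModel {
  private data: Tree = {}

  // new(pattern): keep the pattern that would later drive cell gathering
  constructor(readonly pattern: Pattern) {}

  destroy(): void {
    this.data = {}
  }

  duplicate(): MemoryStore {
    const copy = new MemoryStore(this.pattern)
    // A JSON round trip is a simple deep copy for plain data
    copy.data = JSON.parse(JSON.stringify(this.data)) as Tree
    return copy
  }

  // Placeholder: loading cells by pattern is not designed yet
  async sync(_blockNumber?: bigint): Promise<void> {}

  get(path: Path): unknown {
    let node: unknown = this.data
    for (const key of path) {
      if (!isTree(node)) return undefined
      node = node[key]
    }
    return node
  }

  set(path: Path, value: unknown): void {
    if (path.length === 0) throw new Error('path must not be empty')
    let node = this.data
    // Walk to the parent of the last key, creating objects on the way
    for (const key of path.slice(0, -1)) {
      const next = node[key]
      if (isTree(next)) {
        node = next
      } else {
        const created: Tree = {}
        node[key] = created
        node = created
      }
    }
    node[path[path.length - 1]!] = value
  }

  delete(path: Path): void {
    if (path.length === 0) return
    const parent = this.get(path.slice(0, -1))
    if (isTree(parent)) delete parent[path[path.length - 1]!]
  }
}
```

Using an array of keys for `path` is just one option; a dotted string would work as well.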
@Keith-CY Keith-CY added the enhancement New feature or request label Oct 8, 2022
@Keith-CY Keith-CY moved this to 📋 Backlog in CKB JS Backlog Oct 8, 2022
@Keith-CY
Member Author

Keith-CY commented Oct 8, 2022

A rough to-do list for the Store model has been added to this issue. Other models have their own issues.

We can talk a bit more about details and I'll enrich the to-do list along with the discussion. @janx @homura @felicityin @yanguoyu

@yanguoyu
Contributor

yanguoyu commented Oct 8, 2022

Define cell storage struct to gather by the specific pattern.

@Keith-CY
Member Author

Keith-CY commented Oct 9, 2022

Define cell storage struct to gather by the specific pattern.

Does this mean how to match a pattern with a given cell storage struct?

Take the issues in GitHub as an example

If we are building an issue system on-chain and every issue is stored as a cell, it would be structured as

interface Issue {
  open: boolean
  updated_at: timestamp
  labels: Array<string>
  author: User
}

We can then filter cells/issues with a pattern such as is:open sort:updated-desc author:@me

Would the question be what the default struct of a cell off-chain is, and what the corresponding patterns are?
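To make the example concrete, here is a rough sketch of applying a pattern like is:open sort:updated-desc author:@me to off-chain issue objects. The `IssueQuery` shape and `filterIssues` are hypothetical, and the `Issue` type is simplified (author as a plain string, `updated_at` as a numeric timestamp).

```typescript
// Simplified, hypothetical shape of an issue once it is loaded off-chain.
interface Issue {
  open: boolean
  updated_at: number
  labels: string[]
  author: string
}

// Structured form of a query such as "is:open sort:updated-desc author:@me".
interface IssueQuery {
  open?: boolean
  author?: string
  sort?: 'updated-desc' | 'updated-asc'
}

function filterIssues(issues: Issue[], query: IssueQuery): Issue[] {
  // filter() returns a new array, so sorting below does not touch the input
  const matched = issues.filter(
    issue =>
      (query.open === undefined || issue.open === query.open) &&
      (query.author === undefined || issue.author === query.author),
  )
  if (query.sort === 'updated-desc') matched.sort((a, b) => b.updated_at - a.updated_at)
  if (query.sort === 'updated-asc') matched.sort((a, b) => a.updated_at - b.updated_at)
  return matched
}

const issues: Issue[] = [
  { open: true, updated_at: 1, labels: [], author: 'me' },
  { open: false, updated_at: 5, labels: [], author: 'me' },
  { open: true, updated_at: 9, labels: ['bug'], author: 'me' },
  { open: true, updated_at: 7, labels: [], author: 'other' },
]

const myOpenIssues = filterIssues(issues, { open: true, author: 'me', sort: 'updated-desc' })
```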

@yanguoyu
Contributor

yanguoyu commented Oct 9, 2022

interface Issue {
  open: boolean
  updated_at: timestamp
  labels: Array<string>
  author: User
}
Would the question be what's the default struct of a cell off-chain and corresponding patterns ?

Yeah, on the other hand, how to store this information on the chain? Or how to transform these data to bytecode and store them in the cell?

@Keith-CY
Member Author

Keith-CY commented Oct 9, 2022

interface Issue {
  open: boolean
  updated_at: timestamp
  labels: Array<string>
  author: User
}
Would the question be what's the default struct of a cell off-chain and corresponding patterns ?

Yeah, on the other hand, how to store this information on the chain? Or how to transform these data to bytecode and store them in the cell?

That's the detail we're going to confirm in this issue.

@Keith-CY Keith-CY assigned Keith-CY and yanguoyu and unassigned Keith-CY Oct 27, 2022
@Keith-CY Keith-CY added this to the 2022/10/27 - 2022/11/03 milestone Oct 27, 2022
@Keith-CY Keith-CY moved this from 📋 Backlog to 🔖 Ready in CKB JS Backlog Oct 27, 2022
@yanguoyu
Contributor

yanguoyu commented Nov 7, 2022

I have drafted a design for the Store, shown below. Please review. @Keith-CY @homura @IronLu233

  1. Define serialization:
type StorageLoc = 'data' | 'witness'
type OneOfStorageLoc<T> = { data: T; witness?: T } | { data?: T; witness: T }
type StorageOffChain = OneOfStorageLoc<any>
type StorageOnChain = OneOfStorageLoc<Uint8Array>
type StorageOffset = Partial<Record<StorageLoc, number>>

abstract class Storage<T extends StorageOffChain> {
  abstract serialize(data: T): StorageOnChain
  abstract deserialize(data: StorageOnChain): T
}

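class JSONStorage<T extends StorageOffChain> extends Storage<T> {
  // Encode each location that is present as UTF-8 JSON bytes
  serialize(data: T): StorageOnChain {
    return {
      data: data.data ? Buffer.from(JSON.stringify(data.data)) : undefined,
      witness: data.witness ? Buffer.from(JSON.stringify(data.witness)) : undefined,
    } as StorageOnChain
  }
  // Decode through Buffer: calling toString() on a plain Uint8Array returns
  // comma-separated byte values, not the UTF-8 text
  deserialize(data: StorageOnChain): T {
    const dataStr = data.data ? Buffer.from(data.data).toString() : undefined
    const witnessStr = data.witness ? Buffer.from(data.witness).toString() : undefined
    return {
      data: dataStr ? JSON.parse(dataStr) : undefined,
      witness: witnessStr ? JSON.parse(witnessStr) : undefined,
    } as T
  }
}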
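
As a side note, the JSON approach boils down to a round trip between a value and its UTF-8 bytes. A minimal sketch of that round trip, using the standard `TextEncoder` and `TextDecoder` instead of `Buffer`; the helper names `jsonToBytes` and `bytesToJson` are made up for illustration:

```typescript
// Hypothetical helpers: value -> JSON text -> UTF-8 bytes, and back.
function jsonToBytes(value: unknown): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(value))
}

function bytesToJson<T>(bytes: Uint8Array): T {
  // No runtime validation here: the caller is trusted to pass the right T
  return JSON.parse(new TextDecoder().decode(bytes)) as T
}

const original = { name: 'demo', version: '1' }
const encoded = jsonToBytes(original)
const restored = bytesToJson<typeof original>(encoded)
```

Note that `JSON.stringify` cannot encode `bigint` values, so a schema containing them would need a custom encoding.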

Assume a DApp's schema is unique, so we can pass it to kuai as a parameter.
Let me define an entry class:

declare function hexToBytes(rawhex: string | number | bigint): Uint8Array 

class App<T extends StorageOffChain> {
  storageInstance: Storage<T>
  offset: StorageOffset

  constructor(type: Storage<T> = new JSONStorage<T>(), offset: StorageOffset = {}) {
    this.storageInstance = type
    this.offset = offset
  }
  sync(tx: CKBComponents.Transaction) {
    // Get State from tx by S and transfer State to Store by message
    const data = tx.outputsData
    const witness = tx.witnesses
    for (let i = 0; i < data.length; i++) {
      // outputsData and witnesses are hex strings: convert to bytes first,
      // then skip the configured byte offset
      const curData = hexToBytes(data[i]).slice(this.offset.data || 0)
      // there may be fewer witnesses than outputs
      const curWitness = witness[i] ? hexToBytes(witness[i]).slice(this.offset.witness || 0) : undefined
      const dataOffChain = this.storageInstance.deserialize({
        data: curData,
        witness: curWitness,
      })
      // send message to actors to sync data
    }
  }
}
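
`hexToBytes` is only declared in the snippet above. One possible implementation for hex strings is sketched below, together with a helper that skips a byte offset. Both function names are hypothetical. The point it illustrates is that one byte is two hex characters, so a byte offset has to be applied after conversion rather than to the hex string itself.

```typescript
// Hypothetical implementation for hex strings, with or without a 0x prefix.
function hexToBytesSketch(hex: string): Uint8Array {
  const body = hex.startsWith('0x') ? hex.slice(2) : hex
  if (body.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(body)) {
    throw new Error(`invalid hex string: ${hex}`)
  }
  const bytes = new Uint8Array(body.length / 2)
  for (let i = 0; i < bytes.length; i++) {
    // Each byte is encoded as two hex characters
    bytes[i] = parseInt(body.slice(i * 2, i * 2 + 2), 16)
  }
  return bytes
}

// Convert first, then drop the leading byteOffset bytes.
function payloadAfterOffset(hex: string, byteOffset = 0): Uint8Array {
  return hexToBytesSketch(hex).subarray(byteOffset)
}

// Two header bytes (0x00 0x01) followed by the JSON text "{}" (0x7b 0x7d)
const payload = payloadAfterOffset('0x00017b7d', 2)
```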

DApp developers can then pass a schema like this:

type BitSchema = {
  'data': {
    name: string
    version: string
  },
  'witness': {
    domain: {
      name: string
      createTime: string
      count: number
      value: bigint
    }
  }
}

const app = new App<BitSchema>()
app.start()

They can also define a custom Storage:

class Molecule<T extends StorageOffChain> extends Storage<T> {
  serialize(data: T): StorageOnChain {
    return {
      data: new Uint8Array(),
      witness: new Uint8Array(),
    }
  }

  deserialize(data: StorageOnChain): T {
    return {} as T
  }
}

const moleculeApp = new App<BitSchema>(new Molecule())
  2. Pattern
    I designed patterns to match on-chain data by field, using either a match operation or a RegExp.
    To keep patterns from becoming complex, I did not design nested patterns.
export interface PatternItem {
  field: string
  match: {
    op: 'eq' | 'lt' | 'gt' | 'lte' | 'gte' | 'in'
    value: any
  } | RegExp
}

export type Pattern = {
  aggregate: 'OR' | 'AND'
  patterns: PatternItem[]
}[] | PatternItem

I can define a pattern to match AAAAA or match BBBBB

[
  {
    aggregate: 'OR',
    patterns: [
      {
        field: 'data.domain',
        match: {
          op: 'in',
          value: 'AAAAA'
        }
      },
      {
        field: 'data.domain',
        match: {
          op: 'in',
          value: 'BBBBB'
        }
      }
    ]
  }
]

Or simply define a pattern

{
  field: 'data.domain',
  // matches exactly 'AAAAA' or 'BBBBB'; '^(A|B){5}$' would also accept mixes such as 'ABABA'
  match: new RegExp('^(A{5}|B{5})$')
}
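
For reference, one way such patterns might be evaluated against deserialized data is sketched below. `getField`, `compare`, `matchItem` and `matchPattern` are hypothetical helpers, and several semantics are assumptions rather than part of the design: `field` is treated as a dotted path, `in` means membership in an array value, and multiple groups in the outer array are combined with AND.

```typescript
interface PatternItem {
  field: string
  match: { op: 'eq' | 'lt' | 'gt' | 'lte' | 'gte' | 'in'; value: unknown } | RegExp
}
type PatternGroup = { aggregate: 'OR' | 'AND'; patterns: PatternItem[] }
type Pattern = PatternGroup[] | PatternItem

// Resolve a dotted path such as 'data.domain' against a nested object.
function getField(source: unknown, field: string): unknown {
  let node: unknown = source
  for (const key of field.split('.')) {
    if (typeof node !== 'object' || node === null) return undefined
    node = (node as Record<string, unknown>)[key]
  }
  return node
}

// Order two values of the same primitive kind; undefined if not comparable.
function compare(a: unknown, b: unknown): number | undefined {
  if (typeof a === 'number' && typeof b === 'number') return a - b
  if (typeof a === 'string' && typeof b === 'string') return a < b ? -1 : a > b ? 1 : 0
  return undefined
}

function matchItem(source: unknown, item: PatternItem): boolean {
  const actual = getField(source, item.field)
  const match = item.match
  if (match instanceof RegExp) return typeof actual === 'string' && match.test(actual)
  if (match.op === 'eq') return actual === match.value
  if (match.op === 'in') return Array.isArray(match.value) && match.value.indexOf(actual) !== -1
  const order = compare(actual, match.value)
  if (order === undefined) return false
  if (match.op === 'lt') return order < 0
  if (match.op === 'lte') return order <= 0
  if (match.op === 'gt') return order > 0
  return order >= 0 // remaining case: 'gte'
}

function matchPattern(source: unknown, pattern: Pattern): boolean {
  if (!Array.isArray(pattern)) return matchItem(source, pattern)
  // Assumption: every group must match; inside a group, aggregate decides
  return pattern.every(group =>
    group.aggregate === 'OR'
      ? group.patterns.some(item => matchItem(source, item))
      : group.patterns.every(item => matchItem(source, item)),
  )
}

const eitherDomain: Pattern = [
  {
    aggregate: 'OR',
    patterns: [
      { field: 'data.domain', match: { op: 'eq', value: 'AAAAA' } },
      { field: 'data.domain', match: { op: 'eq', value: 'BBBBB' } },
    ],
  },
]
const matchesB = matchPattern({ data: { domain: 'BBBBB' } }, eitherDomain)
```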
  3. Store
    The Store receives messages from the root actor to add or remove stored data by pattern, keeps the messages in memory, and then computes the final States from them.
    The Store saves States as key -> value pairs: the key is the OutPoint converted to a string, and the value is deserialized from a transaction according to a schema.
export class Store<S extends State = State, M extends Message = Message> extends Actor {
  pattern: Pattern;
  protected states: Record<OutPointString, S> = {}
  private messageList: M[] = []

  constructor(pattern: Pattern) {
    super()
    this.pattern = pattern
  }

  // sync from tx or database
  private addState(addStates: Record<OutPointString, S>) {
    this.states = {
      ...this.states,
      ...addStates
    }
  }

  // sync from tx
  private removeState(keys: OutPointString[]) {
    keys.forEach(key => {
      delete this.states[key]
    })
  }

  duplicate() {
    const store =  new Store(this.pattern)
    // copy states and messages
    return store
  }

  handle(message: M) {
    switch (message.type) {
      case 'add-state':
        // if match pattern
        this.addState(message.data)
        break;
      case 'remove-state':
        // if match pattern
        this.removeState(message.data)
        break;
      default:
        break;
    }
  }

  get(path: StataPath) {
    if (path.path) {
      return this.states[path.key][path.path]
    }
    return this.states[path.key]
  }
  set(path: StataPath, value: any) {
    if (path.path) {
      // update only the nested field, leaving the rest of the state intact
      this.states[path.key][path.path] = value
      return
    }
    this.states[path.key] = value
  }

  remove(path: StataPath) {
    if (path.path) {
      // remove only the nested field
      delete this.states[path.key][path.path]
      return
    }
    delete this.states[path.key]
  }

  finalize() {
    this.messageList.forEach(message => {
      this.handle(message)
    })
    return this.states;
  }
}
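
The "keep messages in memory, then compute the final States" idea can be shown in isolation. Below is a self-contained sketch without the Actor base class; `LazyStore`, `receive` and the exact message shapes are assumptions for illustration, not the proposed API.

```typescript
type OutPointString = string
type State = Record<string, unknown>
type Message =
  | { type: 'add-state'; data: Record<OutPointString, State> }
  | { type: 'remove-state'; data: OutPointString[] }

class LazyStore {
  private states: Record<OutPointString, State> = {}
  private pending: Message[] = []

  // Queue the message; nothing is applied to the states yet.
  receive(message: Message): void {
    this.pending.push(message)
  }

  // Apply queued messages in arrival order, then clear the queue
  // so that a second finalize() does not replay them.
  finalize(): Record<OutPointString, State> {
    for (const message of this.pending) {
      if (message.type === 'add-state') {
        this.states = { ...this.states, ...message.data }
      } else {
        for (const key of message.data) delete this.states[key]
      }
    }
    this.pending = []
    return this.states
  }
}

const lazy = new LazyStore()
lazy.receive({ type: 'add-state', data: { '0xaa:0': { name: 'a' }, '0xbb:0': { name: 'b' } } })
lazy.receive({ type: 'remove-state', data: ['0xaa:0'] })
const finalStates = lazy.finalize()
```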

@Keith-CY
Member Author

Keith-CY commented Nov 7, 2022

I have designed something about Store like this. Please have a review. @Keith-CY @homura @IronLu233

  1. Schema:
type StorageType = 'json' | 'molecule'
type StorageLoc = 'data' | 'witness'
export const loc = Symbol('loc')

type DataType = 'string' | 'number' | 'boolean'

export interface Schema {
  [field: string]: DataType | DataType[] | { [field: string]: DataType | DataType[] }
  [loc]?: StorageLoc
}

Assume a DApp's schema is unique, so we can pass it to kuai as a parameter. Let me define an entry class:

class App {
  schema: Schema[]
  storageType: StorageType
  offset: Partial<Record<StorageLoc, number>>
  registerSchema(schema: Schema[], type: StorageType = 'json', offset?: Partial<Record<StorageLoc, number>>) {
    this.schema = schema
    this.storageType = type
    // fall back to an empty offset map so later lookups are safe
    this.offset = offset ?? {}
  }
  sync(tx: CKBComponents.Transaction) {
    //Get State from tx by S and transfer State to Store by message
    const data = tx.outputsData
    const witness = tx.witnesses
    for (let i = 0; i < data.length; i++) {
      const item = data[i];
      const addStates = []
      const removeStates = []
      const inDataFields = this.schema.filter(v => v[loc] === 'data')
      const inWitnessFields = this.schema.filter(v => v[loc] === 'witness')
      if (inDataFields.length) {
        const offset = this.offset.data || 0
        // Deserialization data by offset and type
      }
      if (inWitnessFields.length) {
        const offset = this.offset.witness || 0
        // Deserialization witness by offset and type
      }
      // send message to actors to sync data
    }
  }
}

DApp developers can then pass a schema like this:

const bitSchema: Schema[] = [
  {
    name: 'string',
    version: 'string',
    [loc]: 'data'
  },
  {
    version: 'string',
    [loc]: 'data'
  },
  {
    [loc]: 'witness',
    domain: {
      name: 'string',
      createTime: 'string',
    }
  }
]
const app = new App()
app.registerSchema(bitSchema)
app.start()
  1. number in JavaScript is imprecise and error-prone because of its limited range; explicit types such as decimal, big number, or big decimal would be clearer;
  2. Schema could be recursive, as [field: string]: Schema, so each field would have a loc attribute to specify whether it's in data or witness;
  3. Why does the sync method receive a transaction? If the Store is updated per transaction, there would be a series of updates within a single block;
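
To illustrate the first point: a JavaScript `number` is a 64-bit float, so integers above `Number.MAX_SAFE_INTEGER` (2^53 - 1) are silently rounded, while `BigInt` keeps them exact. The sketch below also shows one possible way to get `bigint` values through JSON by encoding them as strings; `stringifyWithBigInt` is a hypothetical helper, not part of the design.

```typescript
// 2^53 + 1 written as a decimal string so the source itself loses nothing
const rawValue = '9007199254740993'

const asNumber = Number(rawValue) // rounded to the nearest representable double
const asBigInt = BigInt(rawValue) // exact

console.log(asNumber.toString()) // 9007199254740992
console.log(asBigInt.toString()) // 9007199254740993

// JSON.stringify throws on bigint, so encode it as a decimal string instead.
function stringifyWithBigInt(value: unknown): string {
  return JSON.stringify(value, (_key, v) => (typeof v === 'bigint' ? v.toString() : v))
}

const encodedValue = stringifyWithBigInt({ value: asBigInt })
console.log(encodedValue) // {"value":"9007199254740993"}
```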

@yanguoyu
Contributor

yanguoyu commented Nov 8, 2022

number in JavaScript is imprecise and error-prone because of its limited range; explicit types such as decimal, big number, or big decimal would be clearer;

It's a good suggestion, I will update it later.

Schema could be recursive, as [field: string]: Schema so each field would have a loc attribute to specify whether it's in data or witness.

I think we should keep loc at the outermost level. If loc is allowed inside recursive data, an object stored in data may contain fields that are saved in witness, which would be confusing, as in this example:

const bitSchema: Schema[] = [
  {
    name: 'string',
    version: 'string',
    [loc]: 'data',
    domain: {
      [loc]: 'witness',
      name: 'string',
      createTime: 'string',
    }
  }
]
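
With loc kept at the outermost level, splitting the schema by location stays trivial. A rough sketch of that grouping is below; `groupByLoc` is a hypothetical helper, and defaulting a missing loc to 'data' is an assumption.

```typescript
const loc = Symbol('loc')
type StorageLoc = 'data' | 'witness'
type DataType = 'string' | 'number' | 'boolean'

interface Schema {
  [field: string]: DataType | DataType[] | { [field: string]: DataType | DataType[] }
  [loc]?: StorageLoc
}

// Assumption: an entry without loc is stored in 'data'.
function groupByLoc(schemas: Schema[]): Record<StorageLoc, Schema[]> {
  const groups: Record<StorageLoc, Schema[]> = { data: [], witness: [] }
  for (const schema of schemas) {
    groups[schema[loc] ?? 'data'].push(schema)
  }
  return groups
}

const flatSchema: Schema[] = [
  { name: 'string', version: 'string', [loc]: 'data' },
  { domain: { name: 'string', createTime: 'string' }, [loc]: 'witness' },
]
const groupedSchema = groupByLoc(flatSchema)
```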

Why does the sync method receive a transaction? If the Store is updated per transaction, there would be a series of updates within a single block;

It should sync from a block; I was only describing how it would deserialize data from a transaction.
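
Syncing from a block could then mean walking every transaction in it and collecting output data keyed by out point, as sketched below. The `TxLike` and `BlockLike` shapes are minimal and hypothetical (real CKB types carry many more fields), and the `txHash:index` key format is an assumption.

```typescript
// Minimal, hypothetical shapes used only for this sketch.
interface TxLike {
  hash: string
  outputsData: string[]
}
interface BlockLike {
  transactions: TxLike[]
}

// Collect the raw output data of a whole block, keyed by "txHash:index".
function collectOutputs(block: BlockLike): Record<string, string> {
  const result: Record<string, string> = {}
  for (const tx of block.transactions) {
    tx.outputsData.forEach((data, index) => {
      result[`${tx.hash}:${index}`] = data
    })
  }
  return result
}

const sampleBlock: BlockLike = {
  transactions: [
    { hash: '0xaa', outputsData: ['0x01', '0x02'] },
    { hash: '0xbb', outputsData: ['0x03'] },
  ],
}
const blockOutputs = collectOutputs(sampleBlock)
```

Each collected entry would then go through the deserialization step described for a single transaction.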

@Keith-CY
Member Author

I see the design was updated 2 days ago, is it ready for review again?

@yanguoyu
Contributor

Yeah, I have updated the serialization of storage.

@Keith-CY Keith-CY moved this from 🔖 Ready to 🏗 In progress in CKB JS Backlog Nov 17, 2022
@Keith-CY Keith-CY moved this from 🏗 In progress to 👀 In review in CKB JS Backlog Nov 30, 2022
@Keith-CY Keith-CY moved this from 👀 In review to ✅ Done in CKB JS Backlog Nov 30, 2022