swing-store export/restore API (for state-sync) #6773
We'll need to include an
There may be multiple exports per commit() call, for example in the bootstrap block, or when executing swingset between blocks. I think in general we should not attach more semantics to commit beyond "make sure you've saved your data and clean up discarded data as needed".
Again, I think these constraints are too restrictive.
Being pedantic here, but after
BTW the directory form of this could probably be used to build the vaguely-editable "genesis block export" data structure that @arirubinstein has asked about, for situations where we need to halt chain1, export its state to a giant JSON file, modify that somehow, then launch chain2 from the edited version. You could modify some
cc @FUDCo
Likely duplicate of #6562
Ok maybe, but please retain the swing-store-centric "how do I propagate my state to a new swing-store" perspective from this ticket (also the API sketch). The first swing-store needs to provide enough data to the first host application, to allow the second swing-store to request enough data from the second host application, to populate enough swing-store state, to allow the second kernel to "resume" from the snapshot. The other ticket's title "API to get summary of swingstore block changes" treats the host application as the principal, whereas I think it's helpful to think of swing-store as the instigator and host-app as a dumb carrier of data.
Very much the plan. Actually the API is mostly based on state export, and the incremental export of KV data is orthogonal.
I agree that we can consider the other issue as a subset of this one.
So we decided to eschew the iterator approach. The current design I have is the following:

diff --git a/packages/swing-store/src/swingStore.js b/packages/swing-store/src/swingStore.js
index 94630f935..4ce0a2a61 100644
--- a/packages/swing-store/src/swingStore.js
+++ b/packages/swing-store/src/swingStore.js
@@ -62,6 +62,8 @@ export function makeSnapStoreIO() {
* commit: () => Promise<void>, // commit changes made since the last commit
* close: () => Promise<void>, // shutdown the store, abandoning any uncommitted changes
* diskUsage?: () => number, // optional stats method
+ * setKVDataExportCallback: (callback: (newData: KVDataEntry[]) => void) => void, // Set a `callback` invoked by swingStore when new serializable data is available for export
+ * getExporter(): SwingStoreExporter, // Creates an exporter of the swingStore content from the most recent commit point
* }} SwingStoreHostStorage
*
* @typedef {{
@@ -82,6 +84,57 @@ export function makeSnapStoreIO() {
* }} SwingStore
*/
+/**
+ * @typedef {[
+ * key: string,
+ * value: string,
+ * ]} KVDataEntry
+ *
+ * @typedef {object} SwingStoreExporter
+ * Allows exporting data from a swingStore as a fixed view onto the content as
+ * of the most recent commit point when the exporter was created.
+ * The exporter may be used while another SwingStore instance is active for the
+ * same DB, possibly in another thread or process.
+ * It guarantees that regardless of the concurrent activity of other swingStore
+ * instances, the data representing the commit point will stay consistent and
+ * available.
+ *
+ * @property {() => AsyncIterator<KVDataEntry>} getKVData
+ * Get a full dump of KV data from the swingStore. This represents both the
+ * KVStore (excluding host and local prefixes), as well as any data needed to
+ * validate all artifacts, both current and historical. As such it represents
+ * the root of trust for the application.
+ * Likely content of validation data (with supporting entries for indexing):
+ * - lastStartPos.${vatID} = ${startPos}
+ * - transcript.${vatID}.${startPos} = ${endPos}-${rollingHash}
+ * - heap-snapshot.${vatID}.${startPos} = ${hash}
+ *
+ * @property {(options: {includeHistorical: boolean}) => AsyncIterator<string>} getArtifactNames
+ * Get a list of names of artifacts available from the swingStore.
+ * A name returned by this method guarantees that a call to `getArtifact` on
+ * the same exporter instance will succeed. Options control the filtering of
+ * the artifact names yielded.
+ * Likely artifact names:
+ * - transcript-${vatID}-${startPos}-${endPos}
+ * - heap-snapshot-${vatID}-${startPos}
+ *
+ * @property {(name: string) => Promise<ArrayBuffer>} getArtifact
+ * Retrieve an artifact by name. May throw if the artifact is not available,
+ * which may occur if the artifact is historical and wasn't preserved.
+ *
+ * @property {() => Promise<void>} close
+ * Dispose of all resources held by this exporter. Any further operation on
+ * this exporter or its outstanding iterators will fail.
+ */
+
+/**
+ * Function used to create a new swingStore from an object implementing the
+ * exporter API. The exporter API may be provided by a swingStore instance, or
+ * implemented by a host to restore data that was previously exported.
+ *
+ * @typedef {(exporter: SwingStoreExporter) => Promise<SwingStore>} ImportSwingStore
+ */
+
/**
* A swing store holds the state of a swingset instance. This "store" is
 * actually several different stores of different types that travel as a flock

It uses a unified exporter interface for the host to get data from a swingStore to generate state-sync snapshots, as well as to create a new swingStore from either an existing swingStore, or from restored state-sync snapshot artifacts.

The consumer drives the consumption of artifacts, deciding which artifacts are needed depending on the kind of usage (state-sync, shallow restore, full restore). The iterator approach that was previously considered would have forced the consumer to evaluate each artifact offered by the exporter, and decide whether it was needed or not for the use case.

I did not want to expose file system concerns in this API and preferred keeping it at the level of kv-data, artifact name and opaque data. It's fairly straightforward to implement a consumer that uses the exporter to write files to disk, or implement the exporter API based on reading data from a directory.

The
That sounds pretty good. I'm guessing that

In discussion with @FUDCo, I've been describing this key/value dataset as the "shadow table", both to avoid confusion with

The successor swing-store gets the full contents of the shadow table first (and it is allowed to rely upon the contents being accurate and complete). Then it gets to work on artifacts, and must compare each alleged artifact against some hashes kept in the shadow table. Between the shadow table and the artifacts, the successor must be able to repopulate all the required data. So a big chunk of the shadow table will be filled with

Chip and I figured that we'd need a shadow-table entry for every historical transcript span, forever, so that we retain the ability to validate spans pulled from an archive node in the case of a sleeper-agent upgrade situation. And we might retain an entry for every heap snapshot, just in case. Then there's an entry with the cumulative hash of the active transcript span for each vat (overwritten after every delivery), and likewise for the most recent heap snapshot for each vat.

As we add more tables to swing-store, we either shadow their contents into the shadow table, or we maintain a hash of their contents in the shadow table and prepare an artifact with the contents upon request.

Make sure there's a comment in swing-store with a schema for this shadow table, like the one for
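For illustration, plausible shadow-table contents for one vat might look like the following, using the key shapes sketched in the typedef earlier in this thread (`lastStartPos.${vatID}`, `transcript.${vatID}.${startPos}`, `heap-snapshot.${vatID}.${startPos}`); the vat ID and hash values are invented:

```javascript
// Hypothetical shadow-table entries for a single vat "v1". Keys follow the
// naming sketch from the proposed exporter typedef; values are made up.
const shadowTable = new Map([
  // historical span: "endPos-rollingHash", retained forever for validation
  ['transcript.v1.0', '150-3ac0e1f'],
  // where the current (still-growing) span starts
  ['lastStartPos.v1', '150'],
  // hash of the heap snapshot taken at that position
  ['heap-snapshot.v1.150', '9b71d22'],
]);
```

A successor swing-store would use `lastStartPos.v1` to know which transcript artifact is the current span, and the per-span and per-snapshot hashes to validate the corresponding artifacts.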
A question arising during implementation: instead of or in addition to the
I think it's acceptable, yes, and probably better. I put it on the host facet since it felt like it fit there along
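The contract being discussed can be sketched with a stand-in object; `makeFakeHostFacet` below is not the real swing-store API, just a minimal stub showing how a host would mirror the callback's entries into its own store:

```javascript
// Self-contained sketch of the setKVDataExportCallback contract proposed in
// the diff above. makeFakeHostFacet is a stand-in for the host-facing facet.
function makeFakeHostFacet() {
  let exportCallback = null;
  return {
    setKVDataExportCallback(cb) {
      exportCallback = cb;
    },
    // the real store would invoke the callback whenever new serializable
    // export data becomes available; this helper simulates that
    emitForTest(entries) {
      if (exportCallback) exportCallback(entries);
    },
  };
}

const hostFacet = makeFakeHostFacet();
const mirror = new Map();
hostFacet.setKVDataExportCallback(newData => {
  for (const [key, value] of newData) mirror.set(key, value);
});
hostFacet.emitForTest([['transcript.v1.0', '150-3ac0e1f']]);
```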
Here's some theory-of-operation documentation that I should have written when we started this effort. Finally writing it down is helping to clarify my thinking about the API in PR #7026, so I figured I'd do it before diving into the review. If it survives discussion, I'll make an additional PR to include the docs in the export API changes (creating

SwingStore Data Import/Export

The "SwingStore" package provides the database-backed storage component that each SwingSet kernel uses to hold all necessary state. This includes message queues, c-list tables, XS heap snapshots, and vat delivery transcripts. The host application is responsible for creating a swingstore instance and passing it to the new kernel, and for committing the store's database at the appropriate point in the execution cycle.

Some applications may want to record their state changes in a way that can be cloned, to create new instances of the application. For example, a blockchain may consist of many "validators", each of which holds a replica of (hopefully) identical SwingSet kernel state, and we need a way to launch new validators and bring them quickly and cheaply up-to-date with the existing ones. We want the old validators to publish their SwingSet state, and for a prospective new validator node to be able to download this state as a starting point, rather than needing to replay the entire transaction/transcript history of the chain from the beginning.

This data may follow an untrusted path, so the new node must be able to rely upon (or validate) the data it receives. Typically there is a "block root hash" which they use as a starting point (which they either accept on faith from their operator, or which they somehow test against chain voting rules), then they can validate additional data against this root hash. Blockchain platforms like cosmos-sdk have tools to implement "state-sync", so the library will handle data formatting and distribution.
But at the application layer, we must provide the SwingStore state to this library in a suitable format. The cosmos-sdk state-sync tools require that 1: every block includes a commitment to the entire state of the application, and 2: every once in a while (perhaps once per day) the application will be asked for a set of "export artifacts". The combination of the current block's commitment and the export artifacts should be sufficient for a new participant to receive a state vector that can be safely validated against the current chain state.

Each SwingStore instance provides methods to facilitate this state export, and then to build a new SwingStore from the exported dataset. There is one set of methods to perform one-off full exports of the state. To facilitate consensus machines, a second set is provided to perform incremental export of just the validation data, allowing the (large) remaining data to be exported only on rare occasions.

Two Stages: Export Data and Export Artifacts

The SwingStore export protocol defines two stages (effectively two datasets). The contents of both are private to the SwingStore (the host application should make no assumptions about their contents or semantics). The first stage is called the "export data", and contains a set of key-value pairs (both strings, TODO blobs?). The second is called the "export artifacts", each of which has a name (a string), and contains a blob of bytes. In general, the artifact blobs are much larger than the first-stage export data values, and take more time to generate. Host applications will typically not access the second-stage export artifacts until after the swingstore

Each time a SwingStore API is used to modify the state somehow (e.g. adding/changing/deleting a

These export data/artifact changes can happen when calling into the kernel (e.g.
invoking the external API of a device, causing the device code to change its own state or push messages onto the run-queue), or by normal kernel operations as it runs (any time

Among other things, the swing-store records a transcript of deliveries for each vat. The collection of all deliveries to a particular vat since its last heap snapshot was written is called the "current span". The first-stage export data will record a single record for each vat that remembers the extent and the hash of the current span. This record then refers to a second-stage export artifact that contains the actual transcript contents.

When a delivery is made, a new entry is appended to the end of the current span. This updates (replaces) the record in the first-stage export data: the new record has a longer extent (the

To clone a SwingStore, the host application must extract both stages from the source copy, and somehow deliver them to a new instance of the application, which can feed both datasets into a new SwingStore. When complete, the destination SwingStore will have the same contents as the original, or at least enough to continue execution from the moment of copy (it may be lacking optional/historical data, like non-current vat transcripts from before the most recent heap snapshot).

The host application is responsible for delivering both datasets, but it is only responsible for maintaining the integrity of the first-stage export data. This table contains enough information to validate the contents of the export artifacts. The new clone is entirely reliant upon the contents of the first stage: if someone can manage to corrupt its contents, the new clone may be undetectably and arbitrarily corrupted. But as long as the first stage was delivered correctly, any changes to the second-stage export artifacts will be discovered by the new SwingStore, and the import process will abort with an error.
This split reduces the cost of supporting occasional state-sync export operations, as described below.

Full Export

The simplest (albeit more expensive) way to use SwingStore data export is by creating an "exporter" and asking it to perform a one-off full export operation. The exporter is created by calling

After calling

const dirPath = '.../swing-store';
const swingStore = openSwingStore(dirPath);
...
await controller.run();
hostStorage.commit();
// spawn a child process
// child process does:
const exporter = makeSwingStoreExporter(dirPath);
// exporter now has a txn, parent process is free to proceed forward
const exportData = new Map();
for (const [key, value] of exporter.getExportData()) {
if (value) {
exportData.set(key, value);
} else {
exportData.delete(key);
}
}
const exportArtifacts = new Map();
for (const name of exporter.getArtifactNames()) {
exportArtifacts.set(name, exporter.getArtifact(name));
}
// export is 'exportData' and 'exportArtifacts'

When doing a complete export, the

Note that the new DB transaction is created during the execution of

Incremental Export

The full export can be useful for backing up a "solo" swingset kernel, where consensus among multiple nodes is not required. However the more common (and complicated) use case is in a consensus machine, where multiple replicas are trying to maintain the same state.

SwingStore offers an "incremental export" mode that is designed to work with the cosmos-sdk state-sync protocol. In this protocol, every block must contain enough information (hashes) to validate the entire state-sync dataset, even though most blocks are not used for state-sync (and only a very few replicas will volunteer to create state-sync data). All validators vote on the block hashes, and these blocks are widely reported by block explorers and follower/archive nodes, so it is fairly easy to answer the question "is this the correct root hash?" for an arbitrary block height.

When someone wants to launch a new validator, they ask around for an available state-sync snapshot. This will typically come from an archiving node, which produces a new snapshot each day. The archive node will report back the block height of their latest state-sync snapshot. The new validator operator must acquire a valid block header for that height, doing their own due diligence on the correctness of that header (checking its hash against public sources, etc). Then they can instruct their application to proceed with the state-sync download, which fetches the contents of the state-sync snapshot and compares them against the approved block header root hash.

So, to include SwingStore data in this state-sync snapshot, we need a way to get the first-stage export data (including its validation hashes) into every block, as cheaply as possible. We defer the more expensive second-stage export until a state-sync producing node decides it is time to make a snapshot.
To support this, SwingStore has an "incremental export" mode. This is activated when the host application supplies an "export callback" option to the SwingStore instance constructor. Instead of retrieving the entire first-stage export data at the end of the block, the host application will be continuously notified about changes to this data as the kernel executes. The host application can then incorporate those entries into an existing hashed Merkle tree (e.g. the cosmos-sdk IAVL tree), whose root hash is included in the consensus block hash. Every time the callback is given

All validator nodes use this export callback, even if they never perform the rest of the export process, to ensure that the consensus state includes the entire first-stage dataset. (Note that the first-stage data is generally smaller than the full dataset, making this relatively inexpensive.)

Then, on the few occasions when the application needs to build a full state-sync snapshot, it can ask the SwingStore (after block commit) for the full set of artifacts that match the most recent commit.

const dirPath = '.../swing-store';
const iavl = ...;
function exportCallback(key, value) {
const iavlKey = `ssed.${key}`; // 'ssed' is short for SwingStoreExportData
if (value) {
iavl.set(iavlKey, value);
} else {
iavl.delete(iavlKey); // value===undefined means delete
}
}
const swingStore = openSwingStore(dirPath, { exportCallback });
...
await controller.run();
hostStorage.commit();
// now, if the validator is configured to publish state-sync snapshots,
// and if this block height is one of the publishing points,
// do the following:
// spawn a child process
// child process does:
const exporter = makeSwingStoreExporter(dirPath);
// note: no exporter.getExportData(), the first-stage data is already in IAVL
const artifacts = new Map();
for (const name of exporter.getArtifactNames()) {
artifacts.set(name, exporter.getArtifact(name));
}
// instruct cosmos-sdk to include 'artifacts' in the state-sync snapshot

Import

On the other end of the export process is an importer. This is a new host application, which wants to start from the contents of the export, rather than initializing a brand new (empty) kernel state. When starting a brand new instance, host applications would normally call

// this is done only the first time an instance is created:
import { openSwingStore } from '@agoric/swing-store';
import { initializeSwingset } from '@agoric/swingset-vat';
const dirPath = './swing-store';
const { hostStorage, kernelStorage } = openSwingStore(dirPath);
await initializeSwingset(config, argv, kernelStorage);

Once the initial state is created, each time the application is launched, it will build a controller around the existing state:

import { openSwingStore } from '@agoric/swing-store';
import { makeSwingsetController } from '@agoric/swingset-vat';
const dirPath = './swing-store';
const { hostStorage, kernelStorage } = openSwingStore(dirPath);
const controller = await makeSwingsetController(kernelStorage);
// ... now do things like controller.run(), etc

When cloning an existing kernel, the initialization step is replaced with

import { importSwingStore } from '@agoric/swing-store';
const dirPath = './swing-store';
const exporter = {
  getExportData() { /* return iterator of [key,value] pairs */ },
  getArtifactNames() { /* return iterator of names */ },
  getArtifact(name) { /* return blob of artifact data */ },
};
const { hostStorage } = importSwingStore(exporter, dirPath);
hostStorage.commit();
// now the swingstore is fully populated

Once the new SwingStore is fully populated with the previously-exported data, the host application can use

Optional / Historical Data

Some of the data maintained by SwingStore is not strictly necessary for kernel execution, at least under normal circumstances. For example, once a vat worker performs a heap snapshot, we no longer need the transcript entries from before the snapshot was taken, since vat replay will start from the snapshot point. We split each vat's transcript into "spans", delimited by heap snapshot events, and the "current span" is the most recent one (still growing), whereas the "historical spans" are all closed and immutable. Likewise, we only really need the most recent heap snapshot for each vat: older snapshots might be interesting for experiments that replay old transcripts with different versions of the XS engine, but no normal kernel will ever need them.

Most validators would prefer to prune this data, to reduce their storage needs. But we can imagine some extreme upgrade scenarios that would require access to these historical transcript spans. Our compromise is to record validation data for these historical spans in the export data, but omit the spans themselves from the export artifacts. Validators can delete the old spans at will, and if we ever need them in the future, we can add code that will fetch copies from an archive service, validate them against the export data hashes, and re-insert the relevant entries into the SwingStore.

The

In the future, we will arrange the SwingStore SQLite tables to provide easy

Implementation Details

SwingStore contains components to accommodate all the various kinds of state that the SwingSet kernel needs to store. This currently consists of three portions:
Currently, the SwingStore treats transcript spans and heap snapshots as export artifacts, with hashes recorded in the export data for validation (and to remember exactly which artifacts are necessary). The

If some day we implement an IAVL-like Merkle tree inside SwingStore, and use it to automatically generate a root hash for the
Thanks for writing this up. We should add a section about the preferred determinism of the artifact data itself. It's not mandatory for this scheme to work, but the underlying cosmos-sdk and tendermint protocols work a lot better if the data is the same across validators (state-sync snapshot chunks can be fetched from any tendermint node). Notes:
We decided that strings were sufficient for now. The vstorage API currently only supports strings (the underlying DB used by cosmos actually supports bytes if we need them in the future).
The sentence reads as out of place, given we haven't gotten into the scheduling details yet. Also "typically" is too weak: the host application will never see or request artifacts before they're committed.
Nit:

const exportData = new Map();
for await (const [key, value] of exporter.getExportData()) {
if (value) {
exportData.set(key, value);
} else {
exportData.delete(key);
}
}
const exportArtifacts = new Map();
for await (const name of exporter.getArtifactNames()) {
const artifactData = await buffer(exporter.getArtifact(name));
exportArtifacts.set(name, artifactData);
}
// export is 'exportData' and 'exportArtifacts'
Is it ok for such an exporter to carry these multiple assignments and deletions, and feed them as-is into the importer?
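For illustration, the alternative (coalescing on the host side before import) might look like the following sketch, assuming last-write-wins semantics with `null`/`undefined` meaning deletion, as in the callback example earlier:

```javascript
// Sketch: collapse a stream of repeated export-data assignments/deletions so
// only the final state per key reaches the importer. Last write wins; a
// null/undefined value deletes the key (an assumption matching the examples
// in this thread, not a confirmed swing-store contract).
function coalesceExportData(entries) {
  const latest = new Map();
  for (const [key, value] of entries) {
    if (value === undefined || value === null) {
      latest.delete(key);
    } else {
      latest.set(key, value);
    }
  }
  return [...latest.entries()];
}
```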
I'm not sure about the correctness of this assumption, but it's not material anyway. I don't know which type of nodes are configured to carry state-sync snapshots, but "archive nodes" seem restrictive.
Again, not quite. The new node must only configure a trusted height and app hash, but that does not need to be the same height as existing state-sync snapshots, just a "root of trust". It needs to be a past height after which it's OK to receive a state-sync snapshot. Once a client discovers a state-sync snapshot through tendermint, it retrieves and validates the app hash for that snapshot height using the configured RPC server and the trusted-height app hash. The effect is the same: the app hash for the snapshot height is explicitly trusted (through launch configuration), but the state-sync snapshot discovery is more flexible (it uses tendermint p2p connections).
Currently it's not a constructor option but an explicit method call, but I'm ok either way
Nit:
Async iterators again here.
And here:

import { importSwingStore } from '@agoric/swing-store';
const dirPath = './swing-store';
const exporter = {
  getExportData() { /* return async iterator of [key,value] pairs */ },
  getArtifactNames() { /* return async iterator of names */ },
  getArtifact(name) { /* return stream (async iterator of chunks) of artifact data */ },
};
const { hostStorage } = await importSwingStore(exporter, dirPath);
hostStorage.commit();
// now the swingstore is fully populated
We need to be explicit about
What is the Problem Being Solved?
To support `agd` "state-sync" (#3769, #5542, #5934), we need the cosmos-side IAVL tree to contain enough information to restore a copy of the Swingset `swing-store` DB. For some data, we can store an exact copy in the IAVL tree: this uses extra disk space for the redundant copy, but we get state-sync distribution and validation for free (cosmos already knows how to publish and validate the IAVL contents). For other data, we can store a hash in the IAVL tree, and supply a hashed artifact later.

To make this work cleanly with the swingset/swing-store architecture, we're talking about an "export/restore" API for swing-store. Just like how SQLite has a `.dump` command that exports the entire DB in a simple text format (SQL commands which can be restored by `.read`), `swing-store` will have an API that lets you export the contents in a format that is convenient for the host application to store in a different DB, and/or distribute with artifacts later.

If speed/space performance were no concern, the design would be `swingStore.export(directoryWriter)`, which would be given an authority to write arbitrary files to a specified directory tree. This directory full of files would contain the complete contents of the kvStore, the streamStore, and the snapStore. Then `initSwingStore(dirPath)` would grow a companion API named `importSwingStore(dirPath, directoryReader)`, that would take the directory of files and produce a new (but fully populated) swingStore DB from the previously-exported contents.

(If we didn't care about determinism either, we could just run `sqlite3 swingstore.sqlite .dump >export.sql` and call it a day.)

We might still include `.export` for testing or other use cases, but our chain imposes some constraints which would make that impractical for normal use. The cosmos state-sync design requires modules to commit to their contents for every block, even though state-sync export happens on a much slower schedule (perhaps once per day). The state-sync contents must be fully validatable from the block headers (i.e. the IAVL root hash). For chains that keep all of their state in IAVL, this happens automatically, but when modules maintain data outside of IAVL, they're generally required to record a hash of that data into IAVL and then be prepared to produce an artifact (blob) that can be validated by that hash. These artifacts are requested right away, immediately after the IAVL commit, but the module is allowed to take a long time to produce them, and the production work runs in parallel with ongoing block production, so the chain does not slow down. Each state-sync provider is allowed to produce these snapshots on their own schedule, outside of consensus.

In addition, our cosmic-swingset layer calls the swing-store `commit()` method outside of the context of a block (after it calls the `endBlock()` method), so it can no longer perform IAVL writes at that point. This rules out the most natural approach, where cosmic-swingset would do:

Data Validation
IAVL is a Merkle tree, and constantly updates its root hash. This means everything in IAVL can be easily validated against the root hash, which is included in each block header as the AppHash (more or less). So state-sync clients can fetch a copy of the IAVL data from an untrusted provider, populate a new IAVL instance with it, compute the root hash from that data, and then compare it against the consensus-verified AppHash. Clients do not proceed unless that hash matches, at which point they can rely upon the IAVL contents.
Cosmos state-sync provides a way for modules to publish additional artifacts in the state-sync snapshots (#5542). The requirement is that clients will be able to validate alleged copies of these artifacts upon receipt. Clients will fetch both the IAVL data and the other artifacts, then they'll verify the IAVL root hash, then they must verify the other artifacts against data stored in the IAVL tree. Note that it doesn't have to use an exact hash of the artifact for this purpose: the artifact might be compressed, or formatted differently, and the client may need to unpack or rearrange it before verification can happen (and before it can be used). The real requirement is that the unpacked form matches the data approved by consensus, so that an attacker cannot inject invalid data by supplying a malicious artifact.
For example, the swingset transcript store (`streamStore`) is constantly appending entries to the most recent span (the deliveries made since the last XS heap snapshot). We'll maintain a rolling hash of these entries. This takes constant time to update, whereas a hash of the entire (growing) span would cost O(N). The IAVL tree records the most recent `hash_N` value, updated on every block.

The artifact is the list of `[delivery_0, delivery_1, ..]` entries. The client receives the IAVL tree (and validates it), then examines the delivery-list artifact. The client computes `hash_0`, `hash_1`, .. `hash_N` (which takes O(N) time), and compares the result against the IAVL-provided value. At the end of the process, the client knows that all `delivery_0, .. delivery_N` entries are correct, even though nothing computed `sha256(delivery_0 + delivery_1 + ..)`.

The swingset XS heap snapshot store (`snapStore`) records tuples of `(vatID, startPos, heapSnapshotData, snapshotID)`, where the data is a compressed blob, and the ID is a hash of the uncompressed blob. We'll record the `(vatID, startPos, snapshotID)` tuples in the IAVL tree, and we'll use the compressed blob as the published state-sync artifact, published using the ID as a filename/artifact-name. The client must determine which snapshots are expected (one per vat, with the highest `startPos`), and for each of those, it should look for an artifact with the matching name. It must then decompress the artifact, hash the results, and validate that the results match the expected ID. Then it should recompress the decompressed data (to remove any sneaky attacks or variance arising from the compression format), and store the newly compressed data into the `snapStore`.
.Description of the Design
Swingstore will be responsible for providing "exports" of its contents, when requested, at boundaries that correspond to blocks (one sampling point per `commit()` call). It will define an "export directory format", which is how the contents can be expressed as a directory full of files. This format is entirely up to swingstore (opaque to outsiders). It should be versioned: later versions of swingstore are not obligated to accept older exports, but they should error out cleanly, without risk of corrupted/confused data.

Swingstore will also define an "incremental export format", with similar opaqueness/versioning characteristics. This format will consist of the "export key-value pairs" and a set of named artifact blobs. The "export directory format" should be trivially convertible to/from this incremental format. The export keys are likely to be derived from the `kvStore`/`snapStore`/etc keys, e.g. `kvStore.set('foo', 'bar')` might result in an export key name of `kvStore.foo`. However the swingstore is free to use whatever key names it likes, and it is likely to produce a lot of keys that do not directly correspond to single entries in the various `swingStore` components, for validation hashes and metadata about which artifacts are required.

To write the export directory format, a new `swingstore.export(exportPath)` API will be added (or something that takes a suitably-limited `write` authority). This can be called at the appropriate time (outside of the "window") and the swingstore will immediately (and perhaps synchronously, TBD) write out the contents.

To import the directory format, a new `importSwingStore(ssPath, exportPath)` API will be added, as a module-level export, a sibling of `openSwingStore`/`initSwingStore`.

Incremental Export
When opening a swingstore (`openSwingStore`/`initSwingStore`), we'll add an option that enables the export feature, which allows the store to avoid work if the feature is not enabled. It also turns on an assertion to make sure the contents are not changed outside of the open/close window described below (to ensure that all changes are included in the export data). The value of this option will be a host-application-provided callback function (working name `dataExportCallback`), to receive the incremental export data.

Then we'll add a pair of swingstore APIs, working names `openExportWindow` and `closeExportWindow` (but obviously TBD). The host application is required to sandwich its swingstore and kernel usage in the following pattern:

* `ss.openExportWindow()`
* `controller.run()`, etc
* `ss.closeExportWindow()`
* `ss.commit()`
* `ss.getExportArtifacts()`
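A sketch of how a host application might drive this sandwich, with a toy in-memory swingstore standing in for the real one (everything besides the five calls above is invented for illustration):

```javascript
// Toy host-app loop for the open/run/close/commit/getExportArtifacts
// sandwich. makeToySwingStore is a stand-in, not the real swingstore.
function makeToySwingStore(dataExportCallback) {
  let windowOpen = false;
  const kv = new Map();
  return {
    openExportWindow: () => {
      windowOpen = true;
    },
    kvStore: {
      set: (key, value) => {
        if (!windowOpen) throw Error('call openExportWindow first');
        kv.set(key, value);
        dataExportCallback([[`kvStore.${key}`, value]]); // export key-value pair
      },
    },
    closeExportWindow: () => {
      windowOpen = false;
    },
    commit: () => {}, // durable commit elided in this toy
    getExportArtifacts: () => [].values(), // no artifacts in this toy
  };
}

// The host app accumulates export pairs, deleting on `undefined` values.
const exportData = new Map();
const dataExportCallback = pairs => {
  for (const [key, value] of pairs) {
    if (value === undefined) {
      exportData.delete(key);
    } else {
      exportData.set(key, value);
    }
  }
};

const ss = makeToySwingStore(dataExportCallback);
ss.openExportWindow();
ss.kvStore.set('foo', 'bar'); // stands in for controller.run() activity
ss.closeExportWindow();
ss.commit();
for (const artifact of ss.getExportArtifacts()) {
  // write each artifact out (none in this toy)
}
```

The key property shown here is that every mutation happens inside the window, so `exportData` is guaranteed to reflect all changes by the time `commit()` runs.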
The `open` call simply sets the internal flag which says "modifications are allowed now". All normal swingstore APIs (`kvStore.set`, etc) will assert that this flag is `true`, with an error message pointing the user to call `openExportWindow`.

While changes are being made, swingstore might call `dataExportCallback(pairs)` with a list of `[key, value]` or `[key, undefined]` pairs (we use `undefined` to delete a key). Both `key` and `value` will be strings. These represent the "export key-value pairs" from the swingstore incremental export format. The host app is responsible for accumulating all these pairs (deleting when appropriate) and including them in the state-sync data.

The `closeExportWindow()` function clears the internal flag, and may be an opportunity for swingstore to finalize internal data structures, performing some last calls to `dataExportCallback`. The kernel is not allowed to make swingstore calls after `closeExportWindow` is invoked, and the swingstore must not invoke `dataExportCallback` after `closeExportWindow` returns.

If the host app determines that this particular block is the right time to produce a state-sync artifact, it will call `ss.getExportArtifacts()` after `ss.commit()`. It is obligated to call `getExportArtifacts` before the next call to `openExportWindow`, otherwise swingstore is free to discard data, making it impossible to recover those artifact blobs. Likewise `getExportArtifacts` must be called before the process terminates: swingstore is allowed to hold export pointers in ephemeral RAM that are not included in the durable database. In general, swingstore will seek to discard data as aggressively as possible, and `getExportArtifacts` is how the host app signals that it needs some data to be retained long enough to be put into a state-sync artifact.

`getExportArtifacts` will return (TBD) an iterator of `[name, dataIterator]` pairs. The `name` corresponds to a filename in the export directory format, and the `dataIterator` should yield a binary blob (in reasonably-sized chunks) whose contents should be written to that file. The host application is not obligated to record these artifacts in that fashion, however the restore process will expect them to appear in a directory in this format.

The artifacts will include one entry for each vat, holding the most recent heap snapshot (validated by an export-key entry with the hash of the uncompressed contents, plus vatID/endPos metadata), plus an entry for the span of that vat's transcript entries since the snapshot point (validated by the cumulative hash of entries within the span). It might also hold an entry for each earlier span: we're still TBD about whether to include the historical ones, a tradeoff between state-sync size / new-client startup time, versus retaining an ability to do larger-scale replay-based upgrades without first consulting an archive node (possibly unavailable) to fetch the missing-but-hashed spans.
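The host-app side of consuming that iterator might look like the following sketch. The `[name, dataIterator]` shape comes from the text above, but the artifact names and the synchronous-iterator assumption are illustrative only.

```javascript
// Write [name, dataIterator] artifact pairs into an export directory.
// The artifact names and contents below are invented for illustration.
import { mkdtempSync, writeFileSync, readFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

function writeArtifacts(exportPath, artifacts) {
  for (const [name, dataIterator] of artifacts) {
    const chunks = [];
    for (const chunk of dataIterator) {
      chunks.push(chunk); // reasonably-sized binary chunks
    }
    writeFileSync(join(exportPath, name), Buffer.concat(chunks));
  }
}

// Usage with a fake artifacts iterator standing in for getExportArtifacts():
const exportPath = mkdtempSync(join(tmpdir(), 'ss-export-'));
const fakeArtifacts = [
  ['snapshot.v1.5', [Buffer.from('heap '), Buffer.from('snapshot')]],
  ['transcript.v1.5.20', [Buffer.from('transcript entries')]],
].values();
writeArtifacts(exportPath, fakeArtifacts);
```

A host that stores artifacts elsewhere (e.g. directly into a cosmos snapshot chunk) only needs to preserve the name-to-bytes mapping, since restore will rebuild the directory form from it.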
## Import Time
When a new validator wants to start from a state-sync snapshot, the cosmos side will fetch the IAVL tree and all artifact blobs that were created at export time. It will then call into per-module hooks to offer them access to these blobs. The `x/swingset` module hook needs to populate an export directory with all of this data, then call `importSwingstore()` pointing at the directory. Once complete, we should have a fully-populated swingstore, and we can launch a kernel against it (skipping `initializeSwingset()`, also skipping the bootstrap block).

Alternatively, we might build an incremental import API, to match the incremental export API. In this approach, `importSwingstoreIncrementally()` might be given an iterator of export-key-value entries to start with. It would drain the iterator, populating `kvStore`, but also accumulating a list of other data that it needs, including a list of transcript and heap-snapshot blobs. The API would also be given a callback which swingstore could use to ask for artifact blob contents. The cosmos-side `x/swingset` and/or the cosmic-swingset JS code would expect swingstore to pull all the blobs that were referenced by export-key-value entries, and to validate those blobs before writing them into the `streamStore` and `snapStore` tables. (A lot of this depends upon how exactly the cosmos state-sync hooks work.) Having an incremental API would remove the need for some code outside of swingstore to understand enough about the directory format to populate one, preserving the opacity of that format.

## Security Considerations
The most important consideration is that the data written into the swingstore is fully validated against the root AppHash which was verified against the chain (approved by voting of the right set of validators). The state-sync artifacts contain alleged copies of this data, with various formatting changes, but the `import` process is responsible for verifying the contents before they are written into SQLite.

There are lots of opportunities to fail here: plenty of systems have accidentally trusted a complicated transfer format without realizing what sorts of attack vectors they've opened up, especially because a missing validation check doesn't cause visible functional problems. There's no good way to automatically test for this: it requires careful design and careful auditing.

The swingstore `import` API should return a Promise that rejects if any validation check failed. It should probably also delete the partially-initialized DB. In addition, we should not write anything into SQLite until after that piece has been validated.

## Test Plan
Lots of unit tests, mostly in `packages/swingstore`, but also in cosmic-swingset.
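One such test could exercise the incremental-format round trip, along these lines. This is a sketch against toy helpers (`exportPairs`/`importPairs` are invented), not the real swingstore API.

```javascript
// Round-trip sketch: serialize export key-value pairs, then rebuild a
// store from them and compare. Both helpers are invented for this test;
// they mimic the `kvStore.foo` export-key naming described above.
function exportPairs(kv) {
  return [...kv.entries()].map(([key, value]) => [`kvStore.${key}`, value]);
}

function importPairs(pairs) {
  const kv = new Map();
  for (const [exportKey, value] of pairs) {
    if (!exportKey.startsWith('kvStore.')) {
      throw Error(`unrecognized export key ${exportKey}`);
    }
    kv.set(exportKey.slice('kvStore.'.length), value);
  }
  return kv;
}

const original = new Map([['foo', 'bar'], ['baz', 'qux']]);
const restored = importPairs(exportPairs(original));
```

The real tests would do the same comparison through `getExportArtifacts` and the import API, asserting that a restored swingstore is indistinguishable from the original.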