PredixSDK Database Guide

Concepts

NoSQL Database

The PredixSDK Database uses a document store, a type of "NoSQL" (non-relational) database. The central concept of this type of database is the "document". A document encapsulates and organizes data in name/value pairs, and each document is identified by a system-wide unique identifier.

At a document's top level, each name/value pair must have a name (a unique string) and a value. The value may be a simple type such as a string, integer, or boolean, or a complex type such as an array or another name/value pair dictionary. Documents are ultimately encoded as JSON, so only JSON-compatible types are allowed in documents.
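As a rough illustration of what "JSON-compatible" means, the sketch below uses Foundation's JSONSerialization to check two plain dictionaries. The dictionaries are stand-ins for document bodies, not the SDK's Document type, and the check shown is Foundation's rule, which may differ in detail from what the SDK enforces.

```swift
import Foundation

// Stand-in for a document body: a plain dictionary, not the SDK's Document type.
let validBody: [String: Any] = [
    "aString": "string data",
    "anInt": 123,
    "aBool": true,
    "anArray": [1, 2, 3],
    "aDictionary": ["nested": "value"]
]

// A Date is not one of the JSON types, so JSONSerialization rejects this body.
let invalidBody: [String: Any] = ["created": Date()]

let validIsJSON = JSONSerialization.isValidJSONObject(validBody)
let invalidIsJSON = JSONSerialization.isValidJSONObject(invalidBody)

print("valid body is JSON-compatible: \(validIsJSON)")
print("body with a raw Date is JSON-compatible: \(invalidIsJSON)")
```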

Async interaction

To make the best use of mobile device resources and keep the UI responsive, the PredixSDK Database API is largely asynchronous. The API calls for most document interactions and for running queries take a Swift completion handler closure as a parameter. When the requested operation is complete, this closure is called. This is similar to many other APIs within the Apple ecosystem. Developers using the PredixSDK should ensure they have a thorough understanding of asynchronous patterns, GCD, and Swift closures to get the best performance and responsiveness from their PredixSDK-based applications.

Replication

The PredixSDK Database can be used as a local data store, and can also be used with the PredixSync service to synchronize documents to and from a backend server and to other users. Data replication can be a complex task, but the Predix system makes it easy to set up and provides flexibility for many complex data interaction use cases. As with any offline data interaction, care should be taken to determine what data a user needs when offline. When designing your application's data model, consider that mobile devices have limited storage capacity and often have limited network bandwidth. A balance must be struck between having enough information to meet the application's use cases and preventing the application from being too "chatty" or attempting to synchronize large quantities of data.

Indexes and Queries

The PredixSDK Database system can create indexes of the documents in its Document Store, and then run queries against those indexes. This enables searches, sorting, summarization, and other data interactions similar to those of a relational database. An index consists of a key (the data being indexed) and an optional value (additional data stored with the index). The data in the index is sorted by key, and queries can retrieve data from the index by specifying either a list of keys or a range of keys. These keys can be simple data types, or arrays to provide more complex sorting and retrieval scenarios.

General Database Topics

Database Getting Started

Opening a database

Before any database interactions can take place, the database must be opened. This involves accessing or creating the physical on-device files of the database. To open a database, a configuration is provided to the API which defines the database. This configuration includes the local file system URL of the database, as well as the database name, and may include a list of indexes to create within the database.

Example:

	let configuration = Database.OpenDatabaseConfiguration.default
	
	do {
	    let database = try Database.open(with: configuration, create: true)
	    if let database = database {
	        // ... the database is now open, and ready for interactions
	    }
	} catch let error {
	    // ... handle error...
	}

As the example above shows, the open database API takes an OpenDatabaseConfiguration structure as a parameter and can throw if an error occurs. For example, open throws if the configuration is invalid, such as when an illegal database name is provided or the file location cannot be written to.

Since a database refers to a physical set of files on the device, multiple calls to open the same database result in returning the same Database object. It is advised to avoid retaining multiple references to the same database. If a database has already been opened, a reference to it can be retrieved using the openedWith function.

Default Configuration

Note in the example above, the configuration is retrieved via a default property. This pre-configured property of the OpenDatabaseConfiguration returns a fully configured structure, using the default file location, and the default name, with no indexes defined.

Closing a database

When database operations have completed, it is important to close the database. This ensures the physical database files are properly closed, ceases any ongoing replication tasks, and frees up device resources. Once closed, no database API methods should be called on the database object. It is advised to set the database object to nil or ensure it goes out of scope after calling close.

Remember too, that since a database refers to a physical set of files on the device, multiple calls to open the same database configuration result in multiple references to the same Database object. Once that database is then closed, care should be taken to ensure no calls are made to any other references to that database.

Prepare for shutdown

When an application is about to shut down, it may be helpful to call Database.prepareForShutdown(). This method will release all database resources, and close all open databases in a single command.

Deleting a database

Deleting a database will first close the database, then remove the physical database files from the device. Once deleted, the local database no longer exists and cannot be recovered.

Documents

As mentioned above, Documents are the central data structure of the PredixSDK database. The Document object is a type of dictionary where the dictionary keys must be strings. Accessing values from a Document is just like accessing values from any other Dictionary in Swift.

let myValue = myDocument[myKey]

Creating documents

There are several initializers for Documents:

Creates a Document object with the provided name/value pairs:

let document: Document = ["aString": "string data", "anInt": 123, "aDouble": 3.14]

Creates a Document object with no data:

let document = Document()

Creates a Document object with no data whose document id will be "my_document":

let document = Document(id: "my_document")

Creates a Document object from the provided [String: Any] type dictionary:

let dictionary: [String: Any] = ["aString": "string data", "anInt": 123, "aDouble": 3.14]
let document = Document(dictionary)

Creates an optional Document object from the provided [AnyHashable: Any] type dictionary, as long as all the dictionary keys are strings:

let dictionary: [AnyHashable: Any] = ["aString": "string data", "anInt": 123, "aDouble": 3.14]
let document = Document(dictionary)

Creates an optional Document object from the provided JSON data:

let jsonData = retrieveJSONData()
let document = Document(json: jsonData)

Creates an optional Document object from the provided JSON string:

let document = Document(json: "{\"aString\": \"string data\", \"anInt\": 123, \"aDouble\": 3.14}")

When a document is created without an id, one will be automatically generated for the document.

Interacting with document data

Interacting with the document's data is just like interacting with any Dictionary:

let document1: Document = ["aString": "string data", "anInt": 123, "aDouble": 3.14]
let stringValue = document1["aString"]

// The value of "stringValue" will be "string data"

Document data can be changed just as simply:

let document1: Document = ["aString": "string data", "anInt": 123, "aDouble": 3.14]
document1["aString"] = "new value"

let stringValue = document1["aString"]
// The value of "stringValue" will be "new value"

// this will add a new key/value pair:
document1["anotherKey"] = "A new key/value pair"

And, of course, Document data can be iterated over:

let document1: Document = ["aString": "string data", "anInt": 123, "aDouble": 3.14]

for (key, value) in document1 {
	print("key: \(key) : value: \(value)")
}

// Will print: 
// 		key: aString : value: string data
// 		key: anInt : value: 123
// 		key: aDouble : value: 3.14

Metadata

All Documents have associated metadata. This data is kept separate from the main document dictionary, but can be accessed via the metaData property of the document. Some, but not all, of the metadata is read-only after document creation.

Useful metadata:

id : Unique identifier of the Document
createDate : Date the Document was created, if available
lastChange : Date the Document was last saved, if available
type : User-specified string, useful for organizing documents
channels : String array of channels, used in replication to control document access

Metadata can be added to a document at Document initialization, by including it in the document data:

let document1: Document = ["id": "my_document_id"]

print(document1.metaData.id) 

// Will print:
// my_document_id

The metadata properties type and channels are read/write properties, and can be updated. All other metadata is read-only.

Attachments

Documents can have associated blobs of data called attachments. This data is saved and replicated with the document. Attachments are useful for images, sound clips, videos, etc., and can be created from either a Data object or a URL. They are accessed via the attachments array property of the document.

Care should be taken to ensure attachments are not too large, as many very large attachments will take up device space and will take longer to synchronize for databases replicating with PredixSync. Additionally, PredixSync has limitations on the size of individual documents, and this size includes the attachments. So a document containing two 10MB attachments would count as a document of more than 20MB on PredixSync.

Subclassing

While data access from Documents is simple using name/value pairs, Documents are designed to be subclassed for more specific data models. A subclassed document can expose properties that make sense for that data model, and ensure at initialization time the required properties are included. Developers are encouraged to subclass Documents as needed.

Document Database interaction

All manipulation of a Document object happens in memory. A document must be saved to persist it to the Database, or fetched to retrieve it from the Database.

Database methods that write generally return an UpdateResult enumeration, which on success includes an updated Document object. On failure, the enumeration contains an Error with error details.

Saving documents

The Database save method will save a document to the database and the completion handler will be passed an UpdateResult enumeration containing the saved document:

let database = Database.openedWith(Database.Configuration())
let document1: Document = ["aString": "string data", "anInt": 123, "aDouble": 3.14]

database.save(document1) { result in
	switch result {
		case .success(let savedDocument):
			print("Saved document with id: \(savedDocument.metaData.id)")
		case .failed(let error):
			print("Error saving document: \(error)")
	}
}

Adding and Updating vs Saving

For cases where a developer does not care if a Document is created or updated, the save method will automatically determine if the document already exists, and if so update it, otherwise add it. However, in some cases this is not desired, so the Database object also includes add and update methods, that will return errors if the operation is not appropriate for the provided Document object.
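The difference between the three methods can be sketched with a small in-memory store. Everything here (InMemoryStore, StoreError, the method signatures) is hypothetical and only mirrors the semantics described above; the SDK's own methods are asynchronous and report errors through their completion handlers.

```swift
// Hypothetical in-memory store; only the add/update/save semantics mirror the text.
enum StoreError: Error, Equatable {
    case alreadyExists(String)
    case notFound(String)
}

struct InMemoryStore {
    var documents: [String: [String: String]] = [:]

    // add fails when a document with this id is already stored
    mutating func add(id: String, body: [String: String]) throws {
        guard documents[id] == nil else { throw StoreError.alreadyExists(id) }
        documents[id] = body
    }

    // update fails when no document with this id is stored
    mutating func update(id: String, body: [String: String]) throws {
        guard documents[id] != nil else { throw StoreError.notFound(id) }
        documents[id] = body
    }

    // save creates or overwrites, whichever applies
    mutating func save(id: String, body: [String: String]) {
        documents[id] = body
    }
}

var store = InMemoryStore()
store.save(id: "invoice-1", body: ["status": "draft"])   // creates the document
store.save(id: "invoice-1", body: ["status": "sent"])    // updates the same document

var addError: StoreError?
do {
    try store.add(id: "invoice-1", body: ["status": "duplicate"])
} catch let error as StoreError {
    addError = error
} catch {
    // no other error type is thrown in this sketch
}

var updateError: StoreError?
do {
    try store.update(id: "missing-id", body: ["status": "sent"])
} catch let error as StoreError {
    updateError = error
} catch {
    // no other error type is thrown in this sketch
}
```

Using add or update instead of save makes an unexpected state (a duplicate id, or a document that was never created) surface as an error rather than being silently absorbed.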

Fetching documents

Retrieving a document from the Database is done via the fetchDocument method. The completion handler of this method receives an optional Document object:

let database = Database.openedWith(Database.Configuration())
let myDocumentId = "my_document"

database.fetchDocument(myDocumentId) { fetchedDocument in
	if let document = fetchedDocument {
		print("Fetched document with id: \(document.metaData.id)")
	} else {
		print("No document with id: \(myDocumentId) exists")
	}
}

Multiple documents may be fetched in a single call using the fetchDocuments method. This method takes an array of document ids, and the completion handler is provided a [String: Document] dictionary where the dictionary key is a document id. If any strings in the input array are not associated with a Document, then those keys are not included in the output dictionary.

let database = Database.openedWith(Database.Configuration())

// a document with the id of "my_document_id" exists, but not "not_a_document_id".
let ids = ["my_document_id", "not_a_document_id"] 

database.fetchDocuments(ids) { fetchedDocuments in
	for (_, document) in fetchedDocuments {
		print("fetched document: \(document.metaData.id)")
	}
}

// Will print:
// fetched document: my_document_id

Deleting documents

Documents may be deleted from the database by calling the delete method. This method takes the id of the Document to be deleted, and the completion handler for this method will be passed an UpdateResult enumeration containing the id of the deleted document.

let database = Database.openedWith(Database.Configuration())

database.delete("my_document_id") { result in
	switch result {
		case .success(let deletedDocumentId):
			print("Deleted document with id: \(deletedDocumentId)")
		case .failed(let error):
			print("Error deleting document: \(error)")
	}
}

Indexes and Queries

Other than requesting a document by its id, the other way to retrieve data from the database is to create an Index and then run a query against that index. Using indexes and queries is a very fast and powerful way to interact with data from the database.

An array of indexes is configured as part of the OpenDatabaseConfiguration structure used to open the database. An index consists of at least three components: a String name, a String version, and a mapping closure. Once created, a database index is stored as part of the data in the database, so when using indexes it is important that the index array be included every time the database is opened.

Indexes adhere to the Indexer protocol. A basic implementation of this protocol is provided by the Database.Index class. This class can be subclassed, or a developer can provide their own implementation of Indexer.

Name

The index name is a string that uniquely identifies the index and is used when running queries. It is a best practice to ensure this name is descriptive.

Version

The index version is a string that uniquely identifies the code used to map the index. Changes to an index require special handling. If any changes are made to the index closures, the version string should be changed to ensure the index is properly updated. Failure to change this value when updating the code will lead to unpredictable results.

Mapping

An index can be thought of as a kind of table or dictionary, where you have a key and an optional value. The key is used during the query to filter the results, and the value is extra data that is easily accessed without needing to retrieve the entire document from the database during the query execution. The job of the index's Map closure is to add rows, or key/value pairs, to this dictionary.

The map closure is defined as:

typealias Map = (_ document: Document, _ addIndexRow: @escaping AddIndexRow) -> Void

and AddIndexRow is a closure and defined as:

typealias AddIndexRow = (_ key: Any, _ value: Any?) -> Void

So, in the map closure, the code receives a Document, and an AddIndexRow closure. The document is then used to determine what rows to add to the index, and those rows are added by calling addIndexRow which provides the index key and the optional value.

Example:


let map = { document, addIndexRow in

	if let totalCost = document["TotalCost"] {
		addIndexRow(totalCost, document["InvoiceNumber"])
	}
}

Breaking this down, you have this flow:

If the document contains an element called "TotalCost",
add a row to the index where the key is this total cost,
and set the index value to the value of the element called "InvoiceNumber".

In this example system the developer could then run a query against this index, searching for a range of total costs and getting their invoice number. This query would be sorted by the TotalCost value, and accessing the InvoiceNumber would be extremely fast since it's part of the index data.
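To make the mapping step concrete, here is a self-contained sketch that applies a map closure to a few documents and produces a key-sorted list of rows. The types ExampleDocument, ExampleMap, ExampleAddIndexRow, and IndexRow, and the buildIndex function, are all hypothetical stand-ins; the SDK builds and stores its indexes internally and its typealiases use Any for keys and values.

```swift
// Stand-ins for the SDK types: a document is a plain dictionary here.
typealias ExampleDocument = [String: Any]
typealias ExampleAddIndexRow = (_ key: Double, _ value: String?) -> Void
typealias ExampleMap = (_ document: ExampleDocument, _ addIndexRow: ExampleAddIndexRow) -> Void

// One row of the index: key, optional value, and the id of the source document.
struct IndexRow {
    let key: Double
    let value: String?
    let documentId: String
}

// Same logic as the map example above: emit a row only when TotalCost exists.
let invoiceCostMap: ExampleMap = { document, addIndexRow in
    if let totalCost = document["TotalCost"] as? Double {
        addIndexRow(totalCost, document["InvoiceNumber"] as? String)
    }
}

// Run the map closure over every document and sort the emitted rows by key.
func buildIndex(from documents: [String: ExampleDocument], using map: ExampleMap) -> [IndexRow] {
    var rows: [IndexRow] = []
    for (id, document) in documents {
        map(document, { key, value in
            rows.append(IndexRow(key: key, value: value, documentId: id))
        })
    }
    return rows.sorted { $0.key < $1.key }
}

let exampleDocuments: [String: ExampleDocument] = [
    "doc-a": ["TotalCost": 1200.0, "InvoiceNumber": "INV-2"],
    "doc-b": ["TotalCost": 300.0, "InvoiceNumber": "INV-1"],
    "doc-c": ["Note": "no cost, so no index row"]
]

let rows = buildIndex(from: exampleDocuments, using: invoiceCostMap)
for row in rows {
    print("key: \(row.key), value: \(row.value ?? "none"), document: \(row.documentId)")
}
```

The document without a "TotalCost" element contributes no row, which is how a map closure limits an index to the documents it cares about.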

Running Queries

Queries are how Indexes are used. A query cannot be run without an index, and an index with no queries against it serves no purpose. In a query, the developer identifies the index keys they are interested in, and running the query returns the results matching those index keys. There are two primary models for queries:

Query by key

In a Query by Key style query, a list of explicit keys is provided. The system will return the index rows that match these keys. The structure QueryByKeyList is used to create this type of query:

var query = QueryByKeyList()
query.keys = ["red", "green", "blue", "purple"]

database.runQuery(on: "ColorIndex", with: query) { queryEnumerator in 
	print("Returned \(queryEnumerator.count) rows")
}

In the above example, we're assuming the database has an index called "ColorIndex" defined, where the key is the name of a color. This query will only return rows where the key is one of the four listed colors.

Query by range

A range query specifies a starting key, and an ending key. It will return all keys falling within this range, as sorted by the index. Sorting rules vary by the data type of the key, so strings would be sorted alphabetically, numbers sorted numerically, etc. Additionally, leaving a start key nil would indicate the query should start at the very first row of the index; a nil end key indicates the results should end at the last row of the index, thus providing a "less than" and "greater than" type query:


var query = QueryByKeyRange()
query.startKey = 1000

database.runQuery(on: "InvoiceCostIndex", with: query) { queryEnumerator in
	print("Returned \(queryEnumerator.count) invoices with a total cost of 1000 or greater")
}

In the above example, we're assuming the database has an index called "InvoiceCostIndex" defined, where the key is a numerical value. This query will return all index rows where the value of the key is 1000 or greater.
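The effect of nil start and end keys can be sketched over a plain sorted array of keys. The keys(in:from:to:) helper below is hypothetical and is not an SDK call. It treats both bounds as inclusive; the text above confirms this for the start key ("1000 or greater"), while an inclusive end key is an assumption.

```swift
// Hypothetical helper, not an SDK call: select keys in a range,
// treating a nil bound as "no limit" on that side.
// Both bounds are inclusive here; the inclusive end key is an assumption.
func keys(in sortedKeys: [Double], from startKey: Double?, to endKey: Double?) -> [Double] {
    return sortedKeys.filter { key in
        let afterStart = startKey.map { key >= $0 } ?? true
        let beforeEnd = endKey.map { key <= $0 } ?? true
        return afterStart && beforeEnd
    }
}

let indexKeys: [Double] = [250, 999.99, 1000, 1500, 4200]

let atLeast1000 = keys(in: indexKeys, from: 1000, to: nil)   // "greater than" style query
let atMost1000 = keys(in: indexKeys, from: nil, to: 1000)    // "less than" style query
let between = keys(in: indexKeys, from: 500, to: 2000)       // bounded on both sides

print(atLeast1000, atMost1000, between)
```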

Query Results

The query completion handler provides a QueryResultEnumerator object, which enumerates over a collection of QueryResultRow objects. The QueryResultRow has properties for the index key, the index value, and the document id of the document that generated the key/value pair in the index. For example:


var query = QueryByKeyRange()
query.startKey = 1000

database.runQuery(on: "InvoiceCostIndex", with: query) { queryEnumerator in
	print("Returned \(queryEnumerator.count) invoices with a total cost of 1000 or greater")
	
	while let queryRow = queryEnumerator.next() {
		print("   Invoice: \(queryRow.value) - total cost: \(queryRow.key)")
	}
}

In the above example, building on previous examples, the system will print the invoice number (from the value of the index) and the total cost (from the key of the index) for all documents whose cost is 1000 or greater.

Replication

Data replication with a Predix Sync backend service can be a powerful tool for accessing Predix data while offline, and sharing data with other system users.

Configuration

Configuring replication is straightforward. Supply the URL of the Predix Sync service to use, then set two key options: repeating and bidirectional.

Repeating replication automatically detects changes and replicates them as needed, until explicitly stopped or the application shuts down. This type of replication is good when you want to ensure all changes from one system are sent to another system as soon as possible. This is the behavior if the ReplicationConfiguration object's repeating property is set to true. If the property is false, the replication will be non-repeating. In other words, a single exchange of data will occur, then the replication will be complete. For another exchange to occur, replication would have to be started again.

Bidirectional replication means changes are sent from the Predix Sync server to the client, and from the client back to the Predix Sync server. This is the behavior if the ReplicationConfiguration object's bidirectional property is set to true. If the property is false, data will only be replicated from the Predix Sync service to the client, and no client changes will be sent to the server. This type of replication is good for read-only systems, or systems that want to receive changes immediately but delay sending changes to the server.

Pre-configured options

These ReplicationConfiguration options are very easy to manage, as the ReplicationConfiguration object has static initializers to create common types of replication:

Creates a repeating, bidirectional replication configuration:

let replicationConfig = ReplicationConfiguration.repeatingBidirectionalReplication(with: myPredixSyncURL)

Creates a non-repeating, bidirectional replication:

let replicationConfig = ReplicationConfiguration.oneTimeBidirectionalReplication(with: myPredixSyncURL)

Creates a non-repeating, non-bidirectional replication:

let replicationConfig = ReplicationConfiguration.oneTimeServerToClientReplication(with: myPredixSyncURL)

Starting/Stopping

Starting and stopping replication is simple:

To start the replication exchange:

database.startReplication(with: replicationConfig)

To stop the replication exchange:

database.stopReplication()

Stopping a non-repeating replication is not necessary, as the replication will automatically stop when the data exchange has completed. However, stopReplication() can still be called on a non-repeating replication to cancel a long-running in-process replication.

All replication work happens in a background queue, so there may be a slight delay between calling these methods and the data exchange starting or ending.

ReplicationStatusDelegate

Replication uses a standard delegate pattern to provide information as to the current status of replication. The object associated with the replicationStatusDelegate property of the database will be called for these replication events:

replicationDidComplete
replicationIsSending
replicationIsReceiving
replicationFailed

Information in each of those events allows the developer to handle errors, update status UI, or know when a data exchange has completed.

Advanced Topics

Triggers

DatabaseChangeDelegate

A developer can establish a trigger to be informed of any database changes by associating a databaseChangeDelegate with the database. This delegate is called for all database changes. The delegate will receive an array of DocumentChangedDetails, each of which includes the document id of the changed document, whether the source of the change was replication, and whether the change was a deletion.

Documents

Id factory

Documents created without an id are automatically given one. This id is generated by a static closure on the Document class: idFactory. By default a document id is generated from a UUID, but by replacing this static closure a developer can use a custom scheme to generate document ids.
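The idea of a replaceable id-generating closure can be sketched as follows. The variable exampleIdFactory and its () -> String signature are assumptions made for illustration; the actual signature of the SDK's idFactory is not given in this guide and should be checked in the API documentation.

```swift
import Foundation

// Hypothetical stand-in for a replaceable id-generating closure.
// The real idFactory signature is not shown in this guide.
var exampleIdFactory: () -> String = { UUID().uuidString }

// Default behavior: a plain UUID string.
let defaultId = exampleIdFactory()

// Swap in a custom scheme: a type prefix followed by a lowercased UUID.
exampleIdFactory = { "invoice_" + UUID().uuidString.lowercased() }
let customId = exampleIdFactory()

print(defaultId)
print(customId)
```

Whatever scheme is used, the generated ids still need to be unique across the whole system, which is why this sketch keeps a UUID as part of the custom id.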

Date formatter

While the JSON format does not recognize a Date type, the Document object will recognize data that is a date, and automatically convert it to a Date type. By default the Document class uses the ISO8601 date format standard, but a custom DateFormatter can replace this by assigning an object to the static dateFormatter property.
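For reference, this is what ISO8601 text looks like when converted to and from a Date using Foundation's ISO8601DateFormatter. This is only an illustration of the format; whether the SDK uses this particular Foundation class internally is not stated in this guide.

```swift
import Foundation

// Foundation's ISO8601 formatter; by default it uses the internet
// date-time format in UTC, for example "2017-06-01T12:30:00Z".
let formatter = ISO8601DateFormatter()

let text = "2017-06-01T12:30:00Z"

// Convert the JSON-friendly string into a Date, and back again.
let parsed = formatter.date(from: text)
let roundTripped = parsed.map { formatter.string(from: $0) }

print(roundTripped ?? "could not parse")
```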

Database Configuration

Default

The default database configuration, returned by OpenDatabaseConfiguration.default, uses "pm" as the default name and a subdirectory under the Application Support directory for its path. These defaults can be changed by using a subclass of OpenDatabaseConfiguration and overriding the defaultDatabaseName() and defaultLocation() methods. In this way, a developer could create several subclasses of OpenDatabaseConfiguration to support several defaults easily.

CompletionQueue

By default, all completion handlers for the asynchronous database methods are called back on the main queue. If another queue is desired, the OpenDatabaseConfiguration initializer includes a completionQueue parameter that can be used to provide a custom completion handler queue for the database.

Equality

Two Database.Configuration objects are equal if their database name, and file location are equal. All databases opened with equal configurations return the same database object.

Indexes and Queries

Map/Reduce

An optional capability of Indexes is providing a Reduce function. This function is defined as:

typealias Reduce = (_ keys: [Any], _ values: [Any], _ rereduce: Bool) -> (Any)

It allows map/reduce style queries, where the result rows of the query are summarized by the reduce function before the results are returned.

The reduce function takes an ordered list of key/value pairs. These are the keys and values from the index, as specified by the query parameters. The reduce function aggregates these results into a single object, then returns that object.

Common use cases are to provide subtotals, or averages, or summations of data.

Rereduce

The rereduce flag is used when querying large data sets. When the data set is large enough, the underlying system will break the map/reduce into smaller chunks, run the reduce on each chunk, then run reduce again on the reduced chunks. When this happens, the rereduce flag will be true, the key array will be empty, and the value array will contain the partial reduced values.

Example:

Given an index that emits the type string of each document as the key, and no value, the following reduce function counts the documents of each type:

 let reduce =  { ( keys, values, rereduce) in
 
     var result: [String: Int] = [:]
 
     // if this is not a rereduce
     if !rereduce {
         // count each unique key value
         if let sKeys = keys as? [String] {
             for key in sKeys {
                 var count = result[key] ?? 0
                 count += 1
                 result[key] = count
             }
         }
     } else {
         // This is a rereduce, so the value array is an array of
         // dictionaries of unique key values and their partial counts from above.
         if let partialResults = values as? [[String: Int]] {
             // for each partial result dictionary
             for partial in partialResults {
                 // for each key and its partial count
                 for (key, partialCount) in partial {
                     // add the partial count to the final result dictionary
                     result[key] = (result[key] ?? 0) + partialCount
                 }
             }
         }
     }
 
     // Return the unique key values
     // Note that regardless of the rereduce flag the result is the same data type. 
     // This must always be the case.
     return result
 }
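A useful sanity check for any reduce function is that reducing the data in chunks and then rereducing the partial results gives the same answer as reducing everything in one pass. The sketch below demonstrates this with a standalone, typed version of a count-by-key reduce. The function countByType and its typed signature are assumptions for illustration; the SDK's Reduce typealias uses [Any] arrays, and how the system chooses chunk boundaries is internal to it.

```swift
// Standalone, typed version of a count-by-key reduce (hypothetical signature).
func countByType(keys: [String], values: [[String: Int]], rereduce: Bool) -> [String: Int] {
    var result: [String: Int] = [:]
    if rereduce {
        // Merge partial results by adding their counts together.
        for partial in values {
            for (key, count) in partial {
                result[key, default: 0] += count
            }
        }
    } else {
        // First pass: count each occurrence of a key.
        for key in keys {
            result[key, default: 0] += 1
        }
    }
    return result
}

let allKeys = ["invoice", "invoice", "customer", "invoice", "customer", "order"]

// Reduce everything in one pass.
let single = countByType(keys: allKeys, values: [], rereduce: false)

// Reduce two chunks separately, then rereduce the partial results.
let firstChunk = countByType(keys: Array(allKeys[0..<3]), values: [], rereduce: false)
let secondChunk = countByType(keys: Array(allKeys[3...]), values: [], rereduce: false)
let combined = countByType(keys: [], values: [firstChunk, secondChunk], rereduce: true)

print(single == combined)
```

If the rereduce branch counted keys instead of summing the partial counts, the two results would differ as soon as a key appeared more than once in a chunk.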

Sorting

Indexes are sorted according to their keys. For simple data types like strings and numbers this order is obvious. However, using a key that is an array is particularly useful to achieve a grouped sorting. Elements are compared in order of their array index. So, all the first array elements are compared, then all the second elements, and so on.
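The grouped ordering that array keys produce can be illustrated with Swift tuples, which compare element by element in the same way. This is a sketch of the ordering only: the (customer, totalCost) key layout is a hypothetical example, and real index keys would be arrays sorted by the database, not by this code.

```swift
// Composite keys modelled as (customer, totalCost) tuples.
let compositeKeys: [(String, Int)] = [
    ("Globex", 500),
    ("Acme", 1200),
    ("Globex", 75),
    ("Acme", 300)
]

// Tuples compare element by element: the first elements are compared,
// and the second elements are only consulted when the first are equal.
let sortedKeys = compositeKeys.sorted { $0 < $1 }

let description = sortedKeys.map { "\($0.0):\($0.1)" }
print(description)
```

The result groups all rows for one customer together and orders them by cost within the group, which is what makes a range query over a single customer's rows possible.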

Observing Queries

Queries can be run in the background, with a closure called whenever the query results change. This is known as "observing" the query. The database function observeQuery(on:with:changeHandler:) is used to observe the query. The changeHandler parameter is a closure that will receive the QueryResultEnumerator whenever changes to the database cause the query results to update. This function returns a QueryObserver object. To stop observing the query and clean up system resources, use the database method removeObserver().

Replication

Replication has two ways of controlling which documents are replicated. On the client side, a filter can be established to prevent some documents from being sent from the client to the server. Additionally, channels can be used to limit which documents the server sends to the client.

Filters

To establish a filter, assign a replicationFilterDelegate to the database. This delegate will be called to evaluate each local document, and returns true if the document should be synced to the server. The delegate's method will receive the document being evaluated and an optional dictionary of filterParameters. These filter parameters are set in the ReplicationConfiguration object that initiated the replication.

Channels

Channels are part of the Predix Sync service. The metadata of every document includes an array of strings, the channels property. These are the channel names. On the server side, Predix Sync can be configured to limit the channels a user has access to. Additionally, the ReplicationConfiguration has a limitToChannels property. An empty limitToChannels property (the default) will result in the client receiving all documents the user has access to. However, if channel names are added to the limitToChannels property, only those documents that contain those channels will be sent from the server to the client. This doesn't override the server security settings, but rather can further reduce the documents the client receives for that replication configuration.
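The interaction between server-side channel access and limitToChannels can be modelled with a short sketch. ChannelDocument and documentsToSend are hypothetical, and the rule coded here (a document is sent when it shares at least one channel with the user's allowed channels and, if limitToChannels is not empty, at least one channel with that list) is a reading of the description above, not the Predix Sync implementation.

```swift
// Hypothetical model of a document and its channels metadata.
struct ChannelDocument {
    let id: String
    let channels: Set<String>
}

// Hypothetical model of the rule described above, not SDK or server code.
func documentsToSend(_ documents: [ChannelDocument],
                     userChannels: Set<String>,
                     limitToChannels: Set<String>) -> [String] {
    return documents.filter { document in
        // Server security: the user must have access to one of the document's channels.
        let allowed = !document.channels.isDisjoint(with: userChannels)
        // Client request: an empty limit means "everything the user may see".
        let requested = limitToChannels.isEmpty
            || !document.channels.isDisjoint(with: limitToChannels)
        return allowed && requested
    }.map { $0.id }
}

let serverDocuments = [
    ChannelDocument(id: "a", channels: ["north"]),
    ChannelDocument(id: "b", channels: ["south"]),
    ChannelDocument(id: "c", channels: ["east"])
]
let userChannels: Set<String> = ["north", "south"]

let everythingAllowed = documentsToSend(serverDocuments, userChannels: userChannels, limitToChannels: [])
let southOnly = documentsToSend(serverDocuments, userChannels: userChannels, limitToChannels: ["south"])
let notPermitted = documentsToSend(serverDocuments, userChannels: userChannels, limitToChannels: ["east"])

print(everythingAllowed, southOnly, notPermitted)
```

The last case shows that asking for a channel the user cannot access yields nothing: limitToChannels narrows the result but never widens it.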
