Skip to content

Writing Queries

dan-eyles edited this page Oct 27, 2013 · 34 revisions

Unless you've been living under a rock somewhere for the last couple of years, you have some idea of what a NoSQL database is. In the broad spectrum of NoSQL systems, SculeJS can most accurately be described as document-oriented in much the same way as MongoDB. That's really no coincidence since SculeJS was designed to be like MongoDB - hence the name: minuscule as opposed to humongous.

If you have some familiarity with the way queries are performed in MongoDB then you're already way ahead of the curve, however it probably isn't a bad idea to quickly read this document and familiarize yourself with some of the quirks inherent to the SculeJS way of doing things.

Table Of Contents

Performing queries against SculeJS collections is done using a combination of query expression and query conditions objects. Right now SculeJS supports the following operators:

  • $eq: a general equality operator similar to ==
{a:{$eq:1}}
{a:1}
  • $ne: a general inequality operator similar to !=
{a:{$ne:1}}
  • $gt: an arithmetic operator similar to >
{a:{$gt:1}}
  • $gte: an arithmetic operator similar to >=
{a:{$gte:1}}
  • $lt: an arithmetic operator similar to <
{a:{$lt:1}}
  • $lte: an arithmetic operator similar to <=
{a:{$lte:1}}
  • $in: a list equality operator that determines whether or not a value exists within a defined array of values
{a:{$in:[1, 2, 3, 4, 5, 6]}}
  • $nin: a list equality operator that determines whether or not a value does not exist within a defined array of values
{a:{$nin:[1, 2, 3, 4, 5, 6]}}
  • $all: a list equality operator that determines whether or not an array contains all of the values within a defined array of values
{a:{$all:[1, 2, 3, 4, 5, 6]}}
  • $size: matches an object, string, or array with the number of elements (length) specified by the argument
{a:{$size:10}}
  • $exists: a boolean operator that determines whether or not a field exists in a document
{a:{$exists:true}}
{a:{$exists:false}}
  • $near: a geo-locational proximity operator
{loc:{$near:{lat:30, lon:30, distance:1}}}
  • $within: a geo-locational boundary operator
{loc:{$within:{lat:30, lon:30, distance:1}}}
  • $elemMatch: matches all objects within an array and determines whether one or more match the provided query clauses
{arr:{$elemMatch:{a:3, b:{$gte:4}}}}
  • $where: matches objects using a closure bound to the scope of the current object in the collection. The provided function should return a boolean value indicating whether or not it should be added to the result set. $where expressions cannot be nested (just like $and and $or).
{$where:function() { return this.a == 3; }} // the same as {a:3} or {a:{$eq:3}}
  • $and: a logical AND operator
  • $or: a logical OR operator

Query expressions are built up using a collection of clauses consisting of logical expressions using the operators listed above.

You can also apply conditional operators to queries to affect sorting and limits. Conditions are supplied as a second expression object separate from the main query object:

  • $limit: analogous to the the SQL LIMIT keyword
collection.find({a:3}, {$limit:10}, function(result) {
   // result set will have a maximum of ten entries
});
  • $sort: analogous to the SQL SORT keyword. The sort operator is a little different to other operators in SculeJS - it actually precedes the operand it affects. Following are examples of how to apply sorts to queries:
collection.find({b:3}, {$sort:{a:-1}}, function(result) {
   // sorts results in descending order on the key 'a'
   // more or less the same as SELECT * FROM collection WHERE b = 3 ORDER BY a DESC
});
collection.find({b:3}, {$sort:{a:0}}, function(result) {
   // sorts results in random order on the key 'a'
   // more or less the same as SELECT * FROM collection WHERE b = 3 ORDER BY a RAND()
});
collection.find({b:3}, {$sort:{a:1}}, function(result) {
   // sorts results in ascending order on the key 'a'
   // more or less the same as SELECT * FROM collection WHERE b = 3 ORDER BY a ASC
});

Some example queries with their SQL equivalents are provided below:


SELECT * FROM test WHERE a = 3;
// the following two expressions are equivalent
{a:{$eq:3}}
{a:3}

SELECT * FROM test WHERE a = 3 AND b = 4;
// the following two expressions are equivalent
{a:{$eq:3}, b:{$eq:4}}
{a:3, b:4}

SELECT * FROM test WHERE a > 3;
{a:{$gt:3}}

SELECT * FROM test WHERE a > 3 AND b <= 4;
{a:{$gt:3}, b:{$lte:4}}

SELECT * FROM test WHERE a != 3 AND b >= 5
{a:{$ne:3}, b:{$gte:5}}

SELECT * FROM test WHERE a = 1 OR a = 2 OR a = 3 OR a = 4
{a:{$in:[1, 2, 3, 4]}}

SELECT * FROM test WHERE a != 1 OR a != 2
{a:{$nin:[1, 2]}}

SELECT * FROM test WHERE (a = 3 AND b < 4) OR (a > 5 AND b = 10)
{$or:[{a:3, b:{$lt:4}}, {a:{$gt:5}, b:10}]}

SELECT * FROM test WHERE a = 3 AND b = 3 AND (a > 10 OR b < 11)
{a:3, b:3, $or:[{a:{$gt:10}},{b:{$lt:11}}]}

All clauses in SculeJS query expression objects are logically AND-ed unless otherwise specified by an $or operator. There are also some caveats to using logical operators:

  • You can't nest an $or within an $or
  • You can't nest multiple $and expressions
  • Logical operators require at least two sub-expressions

In actual fact SculeJS will barf if it encounters a query expression with mutiple levels of nested logical operators. The following is an example of some queries that will raise exceptions:

{$or:[{a:3},{$or:[{a:10}, {b:11}]}]}
{$and:[{$and:[{b:1}, {a:3}]}, {$or:[{a:3}, {a:10}]}]}

You can retrieve objects from SculeJS collections using one of three methods:

find ( query[Object], conditions[Object], callback[Function] )

The find function provides the primary interface for performing complex queries against SculeJS collections. When called with well formed query expression the function will return an array containing all matching objects in the collection. You can also provide an optional callback function that will be executed with the result set as the sole argument once the query is completed.

The find function has the following syntax:

collection.find( <query>, <conditions>, <callback> );

If no arguments are provided the find function will return all objects in the collection. Let's take a look at a few examples...

Let's say you're building a web based game and need to store arbitrary user data in order to generate leader boards and display player positions on a map. You'll probably need to store the following set of attributes:

  • player name
  • nickname
  • date joined
  • current score
  • position

Your user object might look something like the following:

{
   name: 'Joe Player',
   nickname: 'cerealkiller',
   joined: 1357781961, // a unix timestamp
   score: 102034, 
   positions: {
      lat: 30,
      lon: 30
   }
}

You can see that we've represented the joined date/time as a unix timestamp, the score as an integer and the geographic position of player as an object with two attributes: lat the latitude of the coordinate pair, lon the longitude of the coordinate pair.

In this example we have 1000 player objects stored in a collection named players - this data might come from a web service end-point in your system, it doesn't really matter, for the purposes of this example we'll just assume it's already stored in a SculeJS collection. To get a list of the top ten users sorted by score descending we'd perform the following query:

var collection = Scule.factoryCollection('scule+local://players');
collection.find({}, {$limit:10, $sort:{score:-1}}, function(results) {
   // do something with results here
});

Easy huh? How about if you wanted to return the top scoring players within an arbitrary geographic distance around the current player?

// let's assume the users lat/lon co-ordinates are 30, 130
var collection = Scule.factoryCollection('scule+local://players');
collection.find({$within:{lat:30, lon:130, distance:1}}, {$limit:10, $sort:{score:-1}}, function(results) {
   // do something with results here
});

If you wanted a list of all users who had joined the game between two timestamps:

// let's assume the users lat/lon co-ordinates are 30, 130
var collection = Scule.factoryCollection('scule+local://players');
collection.find({joined:{$gt:1357781961, $lt:1357791961}}, {}, function(results) {
   // do something with results here
});

Finally, if you wanted to grab a user's record using their nickname you'd perform the following query:

var collection = Scule.factoryCollection('scule+local://players');
collection.find({nickname:'cerealkiller'}, {}, function(results) {
   // do something with results here
});

You can also perform counts on collections using the count function with the same arguments you would use for the find function. This roughly equivalent to using the COUNT keyword in an SQL statement. For example, if you wanted to found all users with scores greater than or equal to 1000 you would use the following query:

var collection = Scule.factoryCollection('scule+local://players');
collection.count({score:{$gte:1000}}, {}, function(count) {
   // do something with the count here
});

findAll ( callback[Function] )

The findAll function simply returns an array containing every object in the collection - you can think of it as being analogous to:

SELECT * FROM table

The only (optional) argument for the findAll function is a callback that accepts the results of the query as it's only argument. An example of using findAll is provided below:

var collection = Scule.factoryCollection('scule+dummy://test');
collection.findAll(function(results) {
   results.forEach(function(document) {
      // do something with document here
   });
});

Calling findAll without providing a callback will cause the function to return the results of the query directly to the calling code:

var collection = Scule.factoryCollection('scule+dummy://test');
var result = collection.findAll();
result.forEach(function(document) {
   // do something here
});

findOne ( id[String|ObjectId], callback[Function] )

The findOne function searches the collection for a single object identified by it's primary key value. The function accepts two arguments:

  • id - either a string or an ObjectId instance
  • callback - function that accepts the result of the query (a single object) as it's sole argument

An example of using findOne is provided below:

var collection = Scule.factoryCollection('scule+dummy://test');
collection.findOne('4af9f23d8ead0e1d32000000', function(document) {
   // do something with the document here
});

Calling findOne without providing a callback will cause the function to return the results of the query directly to the calling code:

var collection = Scule.factoryCollection('scule+dummy://test');
var document = collection.findOne('4af9f23d8ead0e1d32000000');

You can modify data in SculeJS collections using two functions:

Update operations modify existing objects in SculeJS collections in place. The update function is the primary interface used to perform updates, it accepts five arguments:

  • A query expression object (see the find function for details)
  • A set of update clauses
  • A condition expression object (see the find function for details)
  • A boolean flag indicating whether to perform an upsert or not - default is false
  • An optional callback function to execute once the operation is complete

Update expression objects are similar to query expression objects and are composed of clauses using the following operators:

  • $set: sets an attribute value - if upsert is true and the attribute name doesn't exist within the object it is created
var collection = Scule.factoryCollection('scule+local://test');
collection.update({b:10}, {$set:{a:5}}, {}, function(results) {
   // do something with list of modified objects
});
  • $unset: removes an attribute
var collection = Scule.factoryCollection('scule+local://test');
collection.update({b:10}, {$unset:{a:true}}, {}, function(results) {
   // do something with list of modified objects
});
  • $inc: increments an attribute value - if upsert is true and the attribute name doesn't exist within the target object it is created and the value is set to the increment amount
var collection = Scule.factoryCollection('scule+local://test');
collection.update({b:10}, {$inc:{i:3}}, {}, function(results) {
   // do something with list of modified objects
});
  • $push: appends a value to an existing array
var collection = Scule.factoryCollection('scule+local://test');
collection.update({b:10}, {$push:{a:'foo'}}, {}, function(results) {
   // do something with list of modified objects
});
  • $pushAll: appends a series of values to an existing array
var collection = Scule.factoryCollection('scule+local://test');
collection.update({b:10}, {$pushAll:{a:['foo', 'bar']}}, {}, function(results) {
   // do something with list of modified objects
});
  • $pull: removes all instances of a value from an existing array
var collection = Scule.factoryCollection('scule+local://test');
collection.update({b:10}, {$pull:{a:'foo'}}, {}, function(results) {
   // do something with list of modified objects
});
  • $pullAll: removes all instances of a series of values from an existing array
var collection = Scule.factoryCollection('scule+local://test');
collection.update({b:10}, {$pullAll:{a:['foo', 'bar']}}, {}, function(results) {
   // do something with list of modified objects
});
  • $pop: removes the last element of an existing array
var collection = Scule.factoryCollection('scule+local://test');
collection.update({b:10}, {$pop:{a:true}}, {}, function(results) {
   // do something with list of modified objects
});

Let's have a look at some examples:

/**
 * Updates objects in the collection where i >= 10, limiting updates to the first ten objects
 * matching the query criteria sorted by i in ascending order. This update will upsert the value
 * 'bar' for the attribute name 'foo' if it is missing from any objects appearing in the list to
 * be mutated.
 */
collection.update({i:{$gte:10}}, {$set:{foo:'bar'}}, {$limit:10, $sort:{i:1}}, true, function(results) { 
   // do something here
}); 
/**
 * Updates objects in the collection where j > 10 and j <= 1000, incrementing the value of i by 1
 */
collection.update({j:{$gt:10, $lte:1000}}, {$inc:{i:1}}, {}, false, function(results) {
   // do something here
});
/**
 * removes the attribute 'bar' from any objects where the value of attribute foo contains the term 'bar'
 */
var results = collection.update({foo:/bar/}, {$unset:{bar:true}});
/**
 * Updates all objects there the value of attribute foo equals 'bar'. This update will increment the value
 * of attribute i by 3, set the value of j to 10 and the value of a to an array containing 1, 2, 3. This 
 * update will also remove the attribute c from all matched objects.
 */
var results = collection.update({foo:'bar'}, {$inc:{i:3}, $set:{j:10, a:[1, 2, 3]}, $unset:{c:true}});

The remove function is the main interface used to remove objects from SculeJS collections. You can think of it as being analogous to the DELETE keyword in SQL. The function accepts three arguments:

  • A query expression object - see the find function for more information
  • A query condition object - see the find function for more information
  • An optional callback function - the sole argument for the callback is a list of the objects affected by the transaction

Let's look at some examples:

/**
 * Deletes the first 100 objects from the collection where a == 3, similar to:
 * DELETE FROM test WHERE a = 3 LIMIT 1000
 */
var collection = Scule.factoryCollection('scule+dummy://test');
collection.remove({a:3}, {$limit:100}, function(results) {
   // do something with removed objects here
});
/**
 * Deletes the first 100 objects from the collection where a > 3 and a <= 199,
 * sorted by the value of attribute 'b' in descending order.
 * Similar to:
 * DELETE FROM test WHERE a = 3 LIMIT 1000
 */
var collection = Scule.factoryCollection('scule+dummy://test');
collection.remove({a:{$gt:100, $lte:199}}, {$limit:100, $sort:{b:-1}}, function(results) {
   // do something with removed objects here
});
/**
 * Deletes all objects from the collection where a == 10,
 * similar to:
 * DELETE FROM test WHERE a = 10
 */
var collection = Scule.factoryCollection('scule+dummy://test');
collection.remove({a:10}, {}, function(results) {
   // do something with removed objects here
});
/**
 * Deletes the object with the corresponding _id,
 * similar to:
 * DELETE FROM test WHERE id = '4af9f23d8ead0e1d32000000'
 */
var collection = Scule.factoryCollection('scule+dummy://test');
collection.remove({_id:Scule.getObjectId('4af9f23d8ead0e1d32000000')}, {}, function(results) {
   // do something with removed object here
});

Map Reduce is a programming model for processing large data sets, and the name of an implementation of the model by Google. MapReduce is typically used to do distributed computing on clusters of computers.

The model is inspired by the map and reduce functions commonly used in functional programming, although their purpose in the MapReduce framework is not the same as their original forms.

You can perform map/reduce operations against SculeJS collections using the mapReduce function. The map function has two arguments:

  • a reference to the document to map
  • a reference to a function called "emit"

The reduce function should return the result of the reduce operation. The options object for the map/reduce operation can optionally contain:

  • an "out" value containing a collection connection url as either a "merge" or "reduce" sub-argument. If provided as a merge the provided collection will be merged with the results of the map/reduce operation.
  • a query expression object
  • a limit/sort expression object
  • a "finalize" value containing a reference to a function. Finalize functions should accept two arguments; a reference to the key being operated on and a reference to results of the reduce function. The finalize function should return a reference to the modified "reduced" value once it has completed execution.

An example options object might look like:

{
     out:{
         merge:'scule+dummy://map-reduce'
     },
     query:{a:10},
     conditions:{$limit:100, $sort:{a:-1}},
     finalize:function (key, reduced) {
         reduced.finalized = key;
         return reduced;
     }
}

The provided callback function is executed once the map/reduce operation is complete. The only argument for this function is the "out" collection. I've included an example of a simple map/reduce below:

Scule.dropAll();
var collection = Scule.factoryCollection('scule+dummy://unittest');
collection.ensureBTreeIndex('a.b', {order:100});
for(var i=0; i < 1000; i++) {
   var r = i%10;
   collection.save({
      a:{
         b:Scule.global.functions.randomFromTo(1, 10)
      },
      bar:'foo'+r,
      arr:[r, r+1, r+2, r+3],
      scl:i
   });
}
collection.mapReduce(
   function(document, emit) {
      emit(document.bar, {scl: document.scl});
   },
   function(key, reduce) {
      var o = {
         count: 0,
         total: 0,
         key: key
      };
      reduce.forEach(function(value) {
         o.count++;
         o.total += value.scl;
      });
      return o;
   },
   {
      out:{
         reduce:'scule+dummy://mapreduce'
      },
      finalize:function(key, reduced) {
         reduced.finalized = key;
         return reduced;
      }
   },
   function(out) {
      var o = out.findAll();
   }
);