Skip to content

ZLEX redis server extension

tw-bert edited this page Apr 1, 2014 · 28 revisions

ZLEX can be used to query the native lexicographical index of redis Sorted Sets.

It can be used as a simple and efficient indexing mechanism in your Redis database, even when storing tens of millions of sorted set members with identical score. Possible use cases are: word completion, auto-suggest lists, pagination. ZLEX does no modifications on the write functionality of redis-server. Instead, it relies on the native storage mechanism that redis-server uses for Sorted Sets: a dictionary combined with a ziplist (small set) or a skiplist (large set). It can be used on redis master, redis slaves (read-only flag on or off), as well as from redis server-side Lua scripting. Therefore, it can be used directly from any redis client module that supports EVAL/EVALSHA/SCRIPT [LOAD|EXISTS] as soon as your server supports it.


Simple example - EX1

Data in Sorted Set - EX1

All sorted set entries below have a score of 0 and are stored in sorted set 'z'.

  • 0 anton
  • 0 fran
  • 0 frank
  • 0 frankie
  • 0 william

ZLEX the Sorted Set - EX1

ZLEX z 0 fra 0 2

  • fran
  • frank
  • frankie

Detailed description of ZLEX

  • syntax:
    • ZLEX key score asciistring start stop [BREAKMODE {[POSTALPHASTOP|NOBREAK]}] [WITHSCORES] [WITHHEADER]
  • info:
    • Return a range of elements using the lexicographical index.
  • description:
    • Return a range of ZSET elements, using the redis-native lexicographical (alphabetical) index on members of a sorted set, starting at score plus asciistring.
    • The sorted set does not have to contain an exact member asciistring, opposed to ZRANK and ZREVRANK commands.
    • The start parameter may be negative, to get a preroll of sorted set elements before the actual start (total number of results will never be more than stop - start + 1).
    • Use BREAKMODE POSTALPHASTOP if you want the query to stop (non inclusive) at a member that does not begin with asciistring. POSTALPHASTOP plus start 0 makes ZLEX work as a case-sensitive STARTSWITH query.
    • Use BREAKMODE NOBREAK if you want the query to continue even when encountering a different score (WITHSCORES makes sense in this mode).
    • WITHHEADER adds two header entries at the top of the returned list, if there are any results. The first is the RANK of the index entry in the sorted set, the second is the actual offset which might differ when a non-zero start gets out of bounds.
  • notes:
    • [#1] Due to the underlying data structure with possible skip list encoding, no collation or utf-8 interpretation is applied. If such functionality is required, collate the value passed to the ZADD command.
    • [#2] The match is raw-byte safe, including null characters.
    • [#3] A negative 'start' parameter (preroll) never causes an overflow past the start of (a score of, without NOBREAK) the sorted set. As a result, you might want to use the WITHHEADER option to determine which element is the actual offset to the start element.
    • [#4] Time complexity: O(log(N)+M) with N being the number of elements in the sorted set and M the number of elements being returned.

Real-life example - EX2; paginate with preroll

full screen

In this example, 'boris' could not be found. The focus will be on 'claire', since that's the next entry in the list.


Real-life example - EX3; auto-suggest lists

See the data in EX2. If the user enters a string b, the following ZLEX query is submitted:

ZLEX "addressbook:idx:firstname" 0 "b" 0 30 BREAKMODE POSTALPHASTOP

The indexed entries returned by ZLEX will be:

bernard
bob

The auto-suggest list presented to the user will be (for example, read below):

Bernard
Bob

This final data comes from "addressbook:data", if we want it cased and/or or uncollated. It is again collected by a Lua script. Note: if we want to trade some memory size in favor of speed, it's ofcourse also possible to denormalize this data, and store the cased and/or uncollated data in the idx ZSET member as well (delimited and/or serialized). Always make sure to be able to update the index entry if the master record changes. Advice: don't rely on ZLEX for updating your indices, but use more direct functions like ZRANK /ZREM / ZADD . This will point immediately to the required record. Therefore, store the lowercased/collated data in your master data as well (so you know exactly what index entries to update).


Some background info on Skiplists

A clear description on wikipedia here, which nicely mirrors Antirez' implementation below:

full screen

Clone this wiki locally