-
Notifications
You must be signed in to change notification settings - Fork 0
ZLEX redis server extension
ZLEX
can be used to query the native lexicographical index of redis Sorted Sets.
It can be used as a simple and efficient indexing mechanism in your Redis database, even when storing tens of millions of sorted set members with identical score. Possible use cases are: word completion, auto-suggest lists, pagination. ZLEX does no modifications on the write functionality of redis-server. Instead, it relies on the native storage mechanism that redis-server uses for Sorted Sets: a dictionary combined with a ziplist (small set) or a skiplist (large set). It can be used on redis master, redis slaves (read-only flag on or off), as well as from redis server-side Lua scripting. Therefore, it can be used directly from any redis client module that supports EVAL
/EVALSHA
/SCRIPT [LOAD|EXISTS]
as soon as your server supports it.
Simple example - EX1
Data in Sorted Set - EX1
All sorted set entries below have a score of 0 and are stored in sorted set 'z'.
- 0 anton
- 0 fran
- 0 frank
- 0 frankie
- 0 william
ZLEX the Sorted Set - EX1
ZLEX z 0 fra 0 2
- fran
- frank
- frankie
Detailed description of ZLEX
- syntax:
ZLEX key score asciistring start stop [BREAKMODE {[POSTALPHASTOP|NOBREAK]}] [WITHSCORES] [WITHHEADER]
- info:
- Return a range of elements using the lexicographical index.
- description:
- Return a range of
ZSET
elements, using the redis-native lexicographical (alphabetical) index on members of a sorted set, starting atscore
plusasciistring
. - The sorted set does not have to contain an exact member
asciistring
, opposed toZRANK
andZREVRANK
commands. - The
start
parameter may be negative, to get a preroll of sorted set elements before the actual start (total number of results will never be more thanstop
-start
+ 1). - Use
BREAKMODE POSTALPHASTOP
if you want the query to stop (non inclusive) at a member that does not begin withasciistring
.POSTALPHASTOP
plusstart 0
makesZLEX
work as a case-sensitive STARTSWITH query. - Use
BREAKMODE NOBREAK
if you want the query to continue even when encountering a different score (WITHSCORES
makes sense in this mode). -
WITHHEADER
adds two header entries at the top of the returned list, if there are any results. The first is the RANK of the index entry in the sorted set, the second is the actual offset which might differ when a non-zerostart
gets out of bounds.
- Return a range of
- notes:
-
[#1]
Due to the underlying data structure with possible skip list encoding, no collation or utf-8 interpretation is applied. If such functionality is required, collate the value passed to the ZADD command. -
[#2]
The match is raw-byte safe, including null characters. -
[#3]
A negative 'start' parameter (preroll) never causes an overflow past the start of (a score of, withoutNOBREAK
) the sorted set. As a result, you might want to use theWITHHEADER
option to determine which element is the actual offset to the start element. -
[#4]
Time complexity: O(log(N)+M) with N being the number of elements in the sorted set and M the number of elements being returned.
-
Real-life example - EX2; paginate with preroll
In this example, 'boris' could not be found. The focus will be on 'claire', since that's the next entry in the list.
Real-life example - EX3; auto-suggest lists
See the data in EX2. If the user enters a string b
, the following ZLEX query is submitted:
ZLEX "addressbook:idx:firstname" 0 "b" 0 30 BREAKMODE POSTALPHASTOP
The indexed entries returned by ZLEX will be:
bernard
bob
The auto-suggest list presented to the user will be (for example, read below):
Bernard
Bob
This final data comes from "addressbook:data", if we want it cased and/or or uncollated. It is again collected by a Lua script. Note: if we want to trade some memory size in favor of speed, it's ofcourse also possible to denormalize this data, and store the cased and/or uncollated data in the idx ZSET member as well (delimited and/or serialized). Always make sure to be able to update the index entry if the master record changes. Advice: don't rely on ZLEX
for updating your indices, but use more direct functions like ZRANK
/ZREM
/ ZADD
. This will point immediately to the required record. Therefore, store the lowercased/collated data in your master data as well (so you know exactly what index entries to update).
Some background info on Skiplists
A clear description on wikipedia here, which nicely mirrors Antirez' implementation below: