Support Stream data structure #59

Closed
FZambia opened this issue May 31, 2022 · 18 comments · Fixed by #157

FZambia commented May 31, 2022

Hello, came across Dragonfly on Hacker News. Very cool project – good luck with it!

I am very interested in Stream data structure support - think that's the only missing command for me to start experimenting with Dragonfly.

Specifically, in my use case I am using:

  • XADD
  • XRANGE
  • XREVRANGE

Hope this will be added at some point.

romange commented May 31, 2022

Ok ok, I hear you loud and clear. Your wish is my command.

romange added a commit that referenced this issue Jun 8, 2022
Covers most functionality for #59.
romange commented Jun 11, 2022

@FZambia you do not use XREAD? how do you pull from the stream?

FZambia commented Jun 11, 2022

Yep, not using it 😀 In my use case I read stream content using two commands: XRANGE and XREVRANGE. The system is https://github.com/centrifugal/centrifugo. It actually uses a combination of Redis PUB/SUB and a Redis Stream for message broadcasting.

But I suppose XREAD is definitely useful for most other Redis users.

romange commented Jun 11, 2022 via email

FZambia commented Jun 11, 2022

> So how do you flush it? Or you use maxlimit to control the capacity?

Yep, limiting the size using the MAXLEN option and setting an expiration time on the entire stream. I am using the stream as a windowed log of messages.
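
For illustration, the write side of such a windowed log is roughly the following command sequence (the key name, MAXLEN value and TTL are placeholders, not the actual Centrifugo values):

XADD stream:chan MAXLEN 1000 * d payload
EXPIRE stream:chan 3600
XRANGE stream:chan - +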

romange commented Jun 11, 2022 via email

romange added a commit that referenced this issue Jun 12, 2022
1. Add tests that cover xrange, xrevrange and various error states for xadd.
2. Implement 8 stream commands overall. Fixes #59.

Signed-off-by: Roman Gershman <[email protected]>
romange commented Jun 13, 2022

@FZambia please take a look at https://github.com/dragonflydb/dragonfly/releases/tag/v0.3.0-alpha

this should support everything you need in streams.

FZambia commented Jun 13, 2022

Many thanks! I'll try and report results here.

FZambia commented Jun 13, 2022

Experimented a bit!

Debian 11: 5.10.0-10-amd64 #1 SMP Debian 5.10.84-1 (2021-12-08) x86_64 GNU/Linux

In a VM with shared CPU (Intel Xeon Processor (Icelake)).

Redis version 6.0.16, Dragonflydb built from source from v0.3.0-alpha

Redis benchmark for PUBLISH op (this is without STREAM - just a pipelined PUBLISH op):

BenchmarkRedisPublish_1Ch-2           	  313024	      3251 ns/op

Dragonflydb (./dragonfly --alsologtostderr):

BenchmarkRedisPublish_1Ch-2           	  138049	     12593 ns/op

Latency is much higher in the Dragonfly case, and throughput is correspondingly lower (I am using pipelining over a single connection, so latency directly affects throughput). I looked at CPU usage on the broker side during the benchmark, and it actually seems lower with Dragonfly (about 2 times less than in the Redis case). As I mentioned, this is a 2-CPU VM, so maybe the real gain should come on multicore machines? Maybe the latency is caused by some internal batching in Dragonfly which collects commands before calling the io_uring API?

Now for checking streams. Unfortunately I was not able to make it work. I am using Lua, and in the Dragonfly case I get the error: NOSCRIPT No matching script. Please use EVAL.

I was not able to reproduce this using redis-cli: scripts registered with SCRIPT LOAD are then executed fine with EVALSHA. Since MONITOR is not available, I can't quickly see which commands are sent in order to reproduce with redis-cli. I am using pipelining in the bench; not sure whether that matters (I tried pipelining in the console with a simple script example, and that also works fine).
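
For context, the usual client-side pattern here (which Redigo's redis.Script helper implements) is to send EVALSHA optimistically and fall back to EVAL when the server replies with NOSCRIPT; a minimal sketch under that assumption, with a placeholder script, keys and address:

package main

import (
	"fmt"

	"github.com/gomodule/redigo/redis"
)

func main() {
	c, err := redis.Dial("tcp", "127.0.0.1:6379")
	if err != nil {
		panic(err)
	}
	defer c.Close()

	// NewScript takes the key count and the script source; Do sends EVALSHA
	// first and retries with EVAL if the script is not loaded on the server.
	script := redis.NewScript(2, "return ARGV[1]")
	reply, err := script.Do(c, "key1", "key2", "arg1")
	if err != nil {
		panic(err)
	}
	fmt.Println(reply)
}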

One interesting thing, BTW: I managed to put Dragonfly into a busy loop (100% CPU all the time, not responding to commands anymore) with this:

redis-cli
SCRIPT LOAD "return ARGV[1]"
EVALSHA 098e0f0d1448c0a81dafe820f66d460eb09263da 2 s d arg1

The result is never returned; Redis worked fine with this.

romange commented Jun 13, 2022

Thanks for letting me know! Is it possible to reproduce BenchmarkRedisPublish_1Ch-2?

Is it possible to reproduce the streaming case?

Re latency and throughput: yes, the average internal latency in DF may be higher than in Redis due to message passing between threads.

Re MONITOR: you can run Dragonfly with ./dragonfly --vmodule=main_service=2 and get all the commands in the log. It's stored as /tmp/dragonfly.INFO.

romange commented Jun 13, 2022

In addition, see #113.

Apparently, Debian 11 has a performance problem with DF, but I am not sure it applies in this case.

FZambia commented Jun 13, 2022

> Re MONITOR: you can run Dragonfly with ./dragonfly --vmodule=main_service=2

Thx, this was useful

> Thanks for letting me know! Is it possible to reproduce BenchmarkRedisPublish_1Ch-2?

You need Go installed:

git clone https://github.com/centrifugal/centrifuge.git
cd centrifuge
go test -run xxx -benchmem -tags integration -bench BenchmarkRedisPublish_1Ch -benchtime 1s

> Is it possible to reproduce the streaming case?

I think I found the reason. In Redis:

127.0.0.1:6379> SCRIPT LOAD "\nlocal epoch\nif redis.call(\'exists\', KEYS[2]) ~= 0 then\n  epoch = redis.call(\"hget\", KEYS[2], \"e\")\nend\nif epoch == false or epoch == nil then\n  epoch = ARGV[6]\n  redis.call(\"hset\", KEYS[2], \"e\", epoch)\nend\nlocal offset = redis.call(\"hincrby\", KEYS[2], \"s\", 1)\nif ARGV[5] ~= \'0\' then\n\tredis.call(\"expire\", KEYS[2], ARGV[5])\nend\nredis.call(\"xadd\", KEYS[1], \"MAXLEN\", ARGV[2], offset, \"d\", ARGV[1])\nredis.call(\"expire\", KEYS[1], ARGV[3])\nif ARGV[4] ~= \'\' then\n\tlocal payload = \"__\" .. \"p1:\" .. offset .. \":\" .. epoch .. \"__\" .. ARGV[1]\n\tredis.call(\"publish\", ARGV[4], payload)\nend\nreturn {offset, epoch}\n\t"
"5707131deda7789195310ee90da9fab71faf2e68"

In Dragonfly:

SCRIPT LOAD "\nlocal epoch\nif redis.call(\'exists\', KEYS[2]) ~= 0 then\n  epoch = redis.call(\"hget\", KEYS[2], \"e\")\nend\nif epoch == false or epoch == nil then\n  epoch = ARGV[6]\n  redis.call(\"hset\", KEYS[2], \"e\", epoch)\nend\nlocal offset = redis.call(\"hincrby\", KEYS[2], \"s\", 1)\nif ARGV[5] ~= \'0\' then\n\tredis.call(\"expire\", KEYS[2], ARGV[5])\nend\nredis.call(\"xadd\", KEYS[1], \"MAXLEN\", ARGV[2], offset, \"d\", ARGV[1])\nredis.call(\"expire\", KEYS[1], ARGV[3])\nif ARGV[4] ~= \'\' then\n\tlocal payload = \"__\" .. \"p1:\" .. offset .. \":\" .. epoch .. \"__\" .. ARGV[1]\n\tredis.call(\"publish\", ARGV[4], payload)\nend\nreturn {offset, epoch}\n\t"
"002c257c910c6033e88c9280f4fe9bb08fa3b131"

I.e. different SHA sums. I am using the Redigo client (https://github.com/gomodule/redigo), which calculates the script SHA sum on the client side. It matches the one from Redis, but as you can see, Dragonfly hashes to a different sum for some reason.
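
For reference, the digest a client like Redigo compares against is just the SHA-1 hex of the exact script bytes; a minimal standalone sketch of that calculation (not Redigo's actual code, and using a placeholder script):

package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

func main() {
	// Clients precompute this digest locally and then call EVALSHA with it,
	// so the server must hash the very same bytes to the very same SHA-1.
	script := "return ARGV[1]" // placeholder; the real script is the Lua shown above
	sum := sha1.Sum([]byte(script))
	fmt.Println(hex.EncodeToString(sum[:])) // must match what SCRIPT LOAD returns
}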

One more thing: even if I try to execute the script above with the hash returned by Dragonfly, I still get the hang and 100% CPU, i.e.:

EVALSHA 002c257c910c6033e88c9280f4fe9bb08fa3b131 2 x y 1 2 3 4 5 6

In Redis the same script works with the same EVALSHA.

romange commented Jun 13, 2022

Ok, it's me being a smartass: https://github.com/dragonflydb/dragonfly/blob/main/src/server/main_service.cc#L705

I will follow up on your feedback, thank you very much, Alexander!

romange commented Jun 15, 2022

@FZambia I checked the loadtest you provided. I can confirm that with such a setup Redis will provide more throughput than Dragonfly. The reason for this is that DF uses message passing between threads (similar to Go channels). Therefore a single PUBLISH request will incur 10-20us server-side latency compared to 1-2us for Redis. However, if you use multiple connections with DF, things change dramatically. DF connections are asynchronous, meaning that even while the first one waits for an answer from the message bus, the others still make progress. With dozens of incoming connections the latency factor stops being important, and DF will provide much better throughput given enough CPU power.
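
To make the multi-connection point concrete, here is a rough Go sketch using redigo (not the Centrifuge benchmark itself; the address, connection count, and message count are made up): each goroutine pipelines PUBLISH commands over its own connection, so the per-request server-side latency overlaps across connections instead of serializing on one pipe.

package main

import (
	"sync"

	"github.com/gomodule/redigo/redis"
)

func main() {
	const conns, perConn = 16, 10000
	var wg sync.WaitGroup
	for i := 0; i < conns; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c, err := redis.Dial("tcp", "127.0.0.1:6379")
			if err != nil {
				panic(err)
			}
			defer c.Close()
			// Queue the whole pipeline without waiting for replies.
			for j := 0; j < perConn; j++ {
				c.Send("PUBLISH", "bench-channel", "payload")
			}
			c.Flush()
			// Drain the replies afterwards.
			for j := 0; j < perConn; j++ {
				c.Receive()
			}
		}()
	}
	wg.Wait()
}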

FZambia commented Jun 15, 2022

@romange many thanks for the explanation! Actually it sounds good: in practice many server nodes connect to Redis, so higher throughput is more important. I will be experimenting with different cases in the future.

Also, thanks for fixing the Lua issues. Hope to try this all again very soon.

FZambia commented Jun 15, 2022

Failed 😀

SCRIPT LOAD "\nlocal epoch\nif redis.call(\'exists\', KEYS[2]) ~= 0 then\n  epoch = redis.call(\"hget\", KEYS[2], \"e\")\nend\nif epoch == false or epoch == nil then\n  epoch = ARGV[6]\n  redis.call(\"hset\", KEYS[2], \"e\", epoch)\nend\nlocal offset = redis.call(\"hincrby\", KEYS[2], \"s\", 1)\nif ARGV[5] ~= \'0\' then\n\tredis.call(\"expire\", KEYS[2], ARGV[5])\nend\nredis.call(\"xadd\", KEYS[1], \"MAXLEN\", ARGV[2], offset, \"d\", ARGV[1])\nredis.call(\"expire\", KEYS[1], ARGV[3])\nif ARGV[4] ~= \'\' then\n\tlocal payload = \"__\" .. \"p1:\" .. offset .. \":\" .. epoch .. \"__\" .. ARGV[1]\n\tredis.call(\"publish\", ARGV[4], payload)\nend\nreturn {offset, epoch}\n\t"

EVALSHA 5707131deda7789195310ee90da9fab71faf2e68 2 x y 1 2 3 4 5 6
F20220615 21:27:01.492447 18153 transaction.cc:1201] TBD: Not supported
*** Check failure stack trace: ***
    @     0x5653be2bed3a  google::LogMessage::Fail()
    @     0x5653be2c51a7  google::LogMessage::SendToLog()
    @     0x5653be2be72d  google::LogMessage::Flush()
    @     0x5653be2bff59  google::LogMessageFatal::~LogMessageFatal()
    @     0x5653be1a666a  dfly::DetermineKeys()
    @     0x5653be148637  dfly::Service::DispatchCommand()
    @     0x5653be1501b6  _ZNSt17_Function_handlerIFvN4absl12lts_202111024SpanINS2_IcEEEEPN4dfly14ObjectExplorerEEZNS5_7Service12EvalInternalERKNS9_8EvalArgsEPNS5_11InterpreterEPNS5_17ConnectionContextEEUlS4_S7_E_E9_M_invokeERKSt9_Any_dataOS4_OS7_.lto_priv.0
    @     0x5653be201e60  dfly::Interpreter::RedisGenericCommand()
    @     0x5653be276ea1  luaD_precall
    @     0x5653be284ec8  luaV_execute
    @     0x5653be277210  luaD_callnoyield
    @     0x5653be2761ca  luaD_rawrunprotected
    @     0x5653be277560  luaD_pcall
    @     0x5653be273f3a  lua_pcallk
    @     0x5653be146e07  dfly::Service::EvalInternal()
    @     0x5653be14b304  dfly::Service::EvalSha()
    @     0x5653be147f4f  dfly::Service::DispatchCommand()
    @     0x5653be209862  facade::Connection::ParseRedis()
    @     0x5653be20c902  facade::Connection::HandleRequests()
    @     0x5653be22f7e4  util::ListenerInterface::RunSingleConnection()
    @     0x5653be22fa92  _ZN5boost6fibers14worker_contextIZN4util17ListenerInterface13RunAcceptLoopEvEUlvE0_JEE4run_EONS_7context5fiberE
    @     0x5653be22c300  _ZN5boost7context6detail11fiber_entryINS1_12fiber_recordINS0_5fiberENS0_21basic_fixedsize_stackINS0_12stack_traitsEEESt5_BindIFMNS_6fibers14worker_contextIZN4util17ListenerInterface13RunAcceptLoopEvEUlvE0_JEEEFS4_OS4_EPSE_St12_PlaceholderILi1EEEEEEEEvNS1_10transfer_tE
    @     0x7faae91ce19f  make_fcontext
*** SIGABRT received at time=1655328421 on cpu 0 ***
PC: @     0x7faae8c61ce1  (unknown)  raise
    @ ... and at least 1 more frames
Aborted

romange commented Nov 14, 2023

@FZambia I noticed you are planning to write a blog post.

Please let us know if we can assist you with tuning Dragonfly for centrifugal workloads. Additionally, if you come across any performance bottlenecks, please inform us. We're here to help!

FZambia commented Nov 20, 2023

@romange hello, thanks for this! Looks like I'll need some help soon. For now I have stepped away from the blog post because I got somewhat strange results which I could not quickly explain, but I hope to come back to it soon.

The first results were generally slower than Redis, but this could be due to the nature of the benchmark. I am already using pipelining extensively, over several connections, using the https://github.com/redis/rueidis library. I am not sure whether this is expected for DF or not. I can provide the full script to run my bench.

Another thing is that sometimes the performance of DF dropped from thousands of requests per second to something like 200; I am not sure why. Redis was super stable with the same bench.

I was testing with Ubuntu 23.10, and I think I also tried Debian 12 in one run (DigitalOcean CPU-optimized droplets).

My main hope for DF from the perspective of the Centrifugo project is the possibility to vertically scale PUB/SUB. With Redis and its sharded PUB/SUB in cluster mode it's possible, but the model is not very suitable for Centrifugo.

This is because Centrifugo has to deal with millions of topics while using only a few dedicated connections for managing subscriptions. In the sharded PUB/SUB case this means that only a pre-sharding technique can achieve that: limiting the slot space to some known value like 32 and creating a dedicated connection for each such slot, so that the slots land on specific nodes of the cluster (see the rough sketch below). This works, but it requires more care and planning, may result in unequal load on Redis cluster nodes due to the small number of shards, and changes the approach to creating keys. So an option to get more performance from a single Redis instance is nice to have.
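
For illustration only, the channel-to-shard mapping behind that pre-sharding idea could look roughly like the Go sketch below (the shard count, hash function, and hash-tag key format are hypothetical, not Centrifugo's actual implementation):

package main

import (
	"fmt"
	"hash/fnv"
)

// numShards is the fixed, pre-chosen number of shards (the "known value like 32").
const numShards = 32

// shardChannel maps a channel to one of numShards buckets and wraps the bucket
// in a Redis Cluster hash tag, so everything belonging to that bucket hashes to
// the same slot and therefore lands on the same cluster node.
func shardChannel(channel string) string {
	h := fnv.New32a()
	h.Write([]byte(channel))
	shard := h.Sum32() % numShards
	return fmt.Sprintf("{shard.%d}.%s", shard, channel)
}

func main() {
	// A dedicated connection per shard would then handle SSUBSCRIBE/SPUBLISH
	// for all channels mapped to that shard.
	fmt.Println(shardChannel("news"))
	fmt.Println(shardChannel("chat:room42"))
}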
