Support Stream data structure #59

Closed
FZambia opened this issue May 31, 2022 · 18 comments · Fixed by #157

FZambia commented May 31, 2022

Hello, came across Dragonfly on Hacker News. Very cool project – good luck with it!

I am very interested in Stream data structure support - think that's the only missing command for me to start experimenting with Dragonfly.

Specifically, in my use case I am using:

  • XADD
  • XRANGE
  • XREVRANGE

Hope this will be added at some point.

romange commented May 31, 2022

Ok ok, I hear you loud and clear. Your wish is my command.

romange added a commit that referenced this issue Jun 8, 2022
Covers most functionality for #59.
romange commented Jun 11, 2022

@FZambia you do not use XREAD? how do you pull from the stream?

FZambia commented Jun 11, 2022

Yep, not using it 😀 In my use case I read stream content using two commands: XRANGE and XREVRANGE. The system is https://github.com/centrifugal/centrifugo. It actually uses a combination of Redis PUB/SUB and a Redis Stream for message broadcasting.

But I suppose XREAD is definitely useful for most other Redis users.

romange commented Jun 11, 2022 via email

FZambia commented Jun 11, 2022

> So how do you flush it? Or you use maxlimit to control the capacity?

Yep, limiting the size using the MAXLEN option and setting an expiration time on the entire stream. I am using the stream as a windowed log of messages.
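
For illustration, the write side of such a windowed log is roughly the following command sequence (the key name, MAXLEN value and TTL are placeholders, not the actual Centrifugo values):

XADD stream:chan MAXLEN 1000 * d payload
EXPIRE stream:chan 3600
XRANGE stream:chan - +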

romange commented Jun 11, 2022 via email

romange added a commit that referenced this issue Jun 12, 2022
1. Add tests that cover xrange, xrevrange and various error states for xadd.
2. Implement 8 stream commands overall. Fixes #59.

Signed-off-by: Roman Gershman <[email protected]>
romange commented Jun 13, 2022

@FZambia please take a look at https://github.com/dragonflydb/dragonfly/releases/tag/v0.3.0-alpha

this should support everything you need in streams.

FZambia commented Jun 13, 2022

Many thanks! I'll try and report results here.

FZambia commented Jun 13, 2022

Experimented a bit!

Debian 11: 5.10.0-10-amd64 #1 SMP Debian 5.10.84-1 (2021-12-08) x86_64 GNU/Linux

In a VM with shared CPU (Intel Xeon Processor (Icelake)).

Redis version 6.0.16, Dragonflydb built from source from v0.3.0-alpha

Redis benchmark for PUBLISH op (this is without STREAM - just a pipelined PUBLISH op):

BenchmarkRedisPublish_1Ch-2           	  313024	      3251 ns/op

Dragonflydb (./dragonfly --alsologtostderr):

BenchmarkRedisPublish_1Ch-2           	  138049	     12593 ns/op

Latency is much higher in the Dragonfly case, and throughput is correspondingly lower (I am using pipelining over a single connection, so latency directly affects throughput). I looked at CPU usage on the broker side during the benchmark, and it actually seems lower with Dragonfly (about 2 times less than in the Redis case). As I mentioned, this is a 2-CPU VM, so maybe the real gain should come on multicore machines? Maybe the latency is caused by some internal batching in Dragonfly which collects commands before calling the io_uring API?

Now for checking streams. Unfortunately I was not able to make it work. I am using Lua, and in the Dragonfly case I get the error: NOSCRIPT No matching script. Please use EVAL.

I was not able to reproduce this using redis-cli: scripts registered with SCRIPT LOAD are then executed fine with EVALSHA. Since MONITOR is not available, I can't quickly see which commands are sent in order to reproduce with redis-cli. I am using pipelining in the bench; not sure whether that matters (I tried pipelining in the console with a simple script example, and that also works fine).
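
For context, the usual client-side pattern here (which Redigo's redis.Script helper implements) is to send EVALSHA optimistically and fall back to EVAL when the server replies with NOSCRIPT; a minimal sketch under that assumption, with a placeholder script, keys and address:

package main

import (
	"fmt"

	"github.com/gomodule/redigo/redis"
)

func main() {
	c, err := redis.Dial("tcp", "127.0.0.1:6379")
	if err != nil {
		panic(err)
	}
	defer c.Close()

	// NewScript takes the key count and the script source; Do sends EVALSHA
	// first and retries with EVAL if the script is not loaded on the server.
	script := redis.NewScript(2, "return ARGV[1]")
	reply, err := script.Do(c, "key1", "key2", "arg1")
	if err != nil {
		panic(err)
	}
	fmt.Println(reply)
}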

One interesting thing, BTW: I managed to put Dragonfly into a busy loop (100% CPU all the time, not responding to commands anymore) with this:

redis-cli
SCRIPT LOAD "return ARGV[1]"
EVALSHA 098e0f0d1448c0a81dafe820f66d460eb09263da 2 s d arg1

The result is never returned; Redis worked fine with this.

romange commented Jun 13, 2022

Thanks for letting me know! Is it possible to reproduce BenchmarkRedisPublish_1Ch-2?

Is it possible to reproduce the streaming case?

Re latency and throughput: yes, the average internal latency in DF may be higher than in Redis due to message passing between threads.

Re MONITOR: you can run Dragonfly with ./dragonfly --vmodule=main_service=2 and get all the commands in the log. It's stored as /tmp/dragonfly.INFO.

romange commented Jun 13, 2022

In addition, see #113.

Apparently, Debian 11 has a performance problem with DF, but I am not sure it applies in this case.

FZambia commented Jun 13, 2022

> Re MONITOR: you can run Dragonfly with ./dragonfly --vmodule=main_service=2

Thx, this was useful

> Thanks for letting me know! Is it possible to reproduce BenchmarkRedisPublish_1Ch-2?

You need Go installed:

git clone https://github.com/centrifugal/centrifuge.git
cd centrifuge
go test -run xxx -benchmem -tags integration -bench BenchmarkRedisPublish_1Ch -benchtime 1s

> Is it possible to reproduce the streaming case?

I think I found the reason. In Redis:

127.0.0.1:6379> SCRIPT LOAD "\nlocal epoch\nif redis.call(\'exists\', KEYS[2]) ~= 0 then\n  epoch = redis.call(\"hget\", KEYS[2], \"e\")\nend\nif epoch == false or epoch == nil then\n  epoch = ARGV[6]\n  redis.call(\"hset\", KEYS[2], \"e\", epoch)\nend\nlocal offset = redis.call(\"hincrby\", KEYS[2], \"s\", 1)\nif ARGV[5] ~= \'0\' then\n\tredis.call(\"expire\", KEYS[2], ARGV[5])\nend\nredis.call(\"xadd\", KEYS[1], \"MAXLEN\", ARGV[2], offset, \"d\", ARGV[1])\nredis.call(\"expire\", KEYS[1], ARGV[3])\nif ARGV[4] ~= \'\' then\n\tlocal payload = \"__\" .. \"p1:\" .. offset .. \":\" .. epoch .. \"__\" .. ARGV[1]\n\tredis.call(\"publish\", ARGV[4], payload)\nend\nreturn {offset, epoch}\n\t"
"5707131deda7789195310ee90da9fab71faf2e68"

In Dragonfly:

SCRIPT LOAD "\nlocal epoch\nif redis.call(\'exists\', KEYS[2]) ~= 0 then\n  epoch = redis.call(\"hget\", KEYS[2], \"e\")\nend\nif epoch == false or epoch == nil then\n  epoch = ARGV[6]\n  redis.call(\"hset\", KEYS[2], \"e\", epoch)\nend\nlocal offset = redis.call(\"hincrby\", KEYS[2], \"s\", 1)\nif ARGV[5] ~= \'0\' then\n\tredis.call(\"expire\", KEYS[2], ARGV[5])\nend\nredis.call(\"xadd\", KEYS[1], \"MAXLEN\", ARGV[2], offset, \"d\", ARGV[1])\nredis.call(\"expire\", KEYS[1], ARGV[3])\nif ARGV[4] ~= \'\' then\n\tlocal payload = \"__\" .. \"p1:\" .. offset .. \":\" .. epoch .. \"__\" .. ARGV[1]\n\tredis.call(\"publish\", ARGV[4], payload)\nend\nreturn {offset, epoch}\n\t"
"002c257c910c6033e88c9280f4fe9bb08fa3b131"

I.e. different SHA sums. I am using the Redigo client (https://github.com/gomodule/redigo), which calculates the script SHA sum on the client side. It matches the one from Redis, but as you can see, Dragonfly hashes to a different sum for some reason.
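
For reference, the digest a client like Redigo compares against is just the SHA-1 hex of the exact script bytes; a minimal standalone sketch of that calculation (not Redigo's actual code, and using a placeholder script):

package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

func main() {
	// Clients precompute this digest locally and then call EVALSHA with it,
	// so the server must hash the very same bytes to the very same SHA-1.
	script := "return ARGV[1]" // placeholder; the real script is the Lua shown above
	sum := sha1.Sum([]byte(script))
	fmt.Println(hex.EncodeToString(sum[:])) // must match what SCRIPT LOAD returns
}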

One more thing: even if I try to execute the script above with the hash returned by Dragonfly, I still get the hang and 100% CPU, i.e.:

EVALSHA 002c257c910c6033e88c9280f4fe9bb08fa3b131 2 x y 1 2 3 4 5 6

In Redis the same script works with the same EVALSHA.

romange commented Jun 13, 2022

Ok, it's me being a smartass: https://github.com/dragonflydb/dragonfly/blob/main/src/server/main_service.cc#L705

I will follow up on your feedback, thank you very much, Alexander!

romange commented Jun 15, 2022

@FZambia I checked the loadtest you provided. I can confirm that with such a setup Redis will provide more throughput than Dragonfly. The reason for this is that DF uses message passing between threads (similar to Go channels). Therefore a single PUBLISH request will incur 10-20us server-side latency compared to 1-2us for Redis. However, if you use multiple connections with DF, things change dramatically. DF connections are asynchronous, meaning that even while the first one waits for an answer from the message bus, the others still make progress. With dozens of incoming connections the latency factor stops being important, and DF will provide much better throughput given enough CPU power.
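
To make the multi-connection point concrete, here is a rough Go sketch using redigo (not the Centrifuge benchmark itself; the address, connection count, and message count are made up): each goroutine pipelines PUBLISH commands over its own connection, so the per-request server-side latency overlaps across connections instead of serializing on one pipe.

package main

import (
	"sync"

	"github.com/gomodule/redigo/redis"
)

func main() {
	const conns, perConn = 16, 10000
	var wg sync.WaitGroup
	for i := 0; i < conns; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c, err := redis.Dial("tcp", "127.0.0.1:6379")
			if err != nil {
				panic(err)
			}
			defer c.Close()
			// Queue the whole pipeline without waiting for replies.
			for j := 0; j < perConn; j++ {
				c.Send("PUBLISH", "bench-channel", "payload")
			}
			c.Flush()
			// Drain the replies afterwards.
			for j := 0; j < perConn; j++ {
				c.Receive()
			}
		}()
	}
	wg.Wait()
}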

FZambia commented Jun 15, 2022

@romange many thanks for the explanation! Actually it sounds good: in practice many server nodes connect to Redis, so higher throughput is more important. I will be experimenting with different cases in the future.

Also, thanks for fixing the Lua issues. Hope to try this all again very soon.

FZambia commented Jun 15, 2022

Failed 😀

SCRIPT LOAD "\nlocal epoch\nif redis.call(\'exists\', KEYS[2]) ~= 0 then\n  epoch = redis.call(\"hget\", KEYS[2], \"e\")\nend\nif epoch == false or epoch == nil then\n  epoch = ARGV[6]\n  redis.call(\"hset\", KEYS[2], \"e\", epoch)\nend\nlocal offset = redis.call(\"hincrby\", KEYS[2], \"s\", 1)\nif ARGV[5] ~= \'0\' then\n\tredis.call(\"expire\", KEYS[2], ARGV[5])\nend\nredis.call(\"xadd\", KEYS[1], \"MAXLEN\", ARGV[2], offset, \"d\", ARGV[1])\nredis.call(\"expire\", KEYS[1], ARGV[3])\nif ARGV[4] ~= \'\' then\n\tlocal payload = \"__\" .. \"p1:\" .. offset .. \":\" .. epoch .. \"__\" .. ARGV[1]\n\tredis.call(\"publish\", ARGV[4], payload)\nend\nreturn {offset, epoch}\n\t"

EVALSHA 5707131deda7789195310ee90da9fab71faf2e68 2 x y 1 2 3 4 5 6
F20220615 21:27:01.492447 18153 transaction.cc:1201] TBD: Not supported
*** Check failure stack trace: ***
    @     0x5653be2bed3a  google::LogMessage::Fail()
    @     0x5653be2c51a7  google::LogMessage::SendToLog()
    @     0x5653be2be72d  google::LogMessage::Flush()
    @     0x5653be2bff59  google::LogMessageFatal::~LogMessageFatal()
    @     0x5653be1a666a  dfly::DetermineKeys()
    @     0x5653be148637  dfly::Service::DispatchCommand()
    @     0x5653be1501b6  _ZNSt17_Function_handlerIFvN4absl12lts_202111024SpanINS2_IcEEEEPN4dfly14ObjectExplorerEEZNS5_7Service12EvalInternalERKNS9_8EvalArgsEPNS5_11InterpreterEPNS5_17ConnectionContextEEUlS4_S7_E_E9_M_invokeERKSt9_Any_dataOS4_OS7_.lto_priv.0
    @     0x5653be201e60  dfly::Interpreter::RedisGenericCommand()
    @     0x5653be276ea1  luaD_precall
    @     0x5653be284ec8  luaV_execute
    @     0x5653be277210  luaD_callnoyield
    @     0x5653be2761ca  luaD_rawrunprotected
    @     0x5653be277560  luaD_pcall
    @     0x5653be273f3a  lua_pcallk
    @     0x5653be146e07  dfly::Service::EvalInternal()
    @     0x5653be14b304  dfly::Service::EvalSha()
    @     0x5653be147f4f  dfly::Service::DispatchCommand()
    @     0x5653be209862  facade::Connection::ParseRedis()
    @     0x5653be20c902  facade::Connection::HandleRequests()
    @     0x5653be22f7e4  util::ListenerInterface::RunSingleConnection()
    @     0x5653be22fa92  _ZN5boost6fibers14worker_contextIZN4util17ListenerInterface13RunAcceptLoopEvEUlvE0_JEE4run_EONS_7context5fiberE
    @     0x5653be22c300  _ZN5boost7context6detail11fiber_entryINS1_12fiber_recordINS0_5fiberENS0_21basic_fixedsize_stackINS0_12stack_traitsEEESt5_BindIFMNS_6fibers14worker_contextIZN4util17ListenerInterface13RunAcceptLoopEvEUlvE0_JEEEFS4_OS4_EPSE_St12_PlaceholderILi1EEEEEEEEvNS1_10transfer_tE
    @     0x7faae91ce19f  make_fcontext
*** SIGABRT received at time=1655328421 on cpu 0 ***
PC: @     0x7faae8c61ce1  (unknown)  raise
    @ ... and at least 1 more frames
Aborted

romange commented Nov 14, 2023

@FZambia I noticed you are planning to write a blog post.

Please let us know if we can assist you with tuning Dragonfly for centrifugal workloads. Additionally, if you come across any performance bottlenecks, please inform us. We're here to help!

FZambia commented Nov 20, 2023

@romange hello, thanks for this! Looks like I'll need some help soon. For now I have stepped away from the blog post because I got somewhat strange results which I could not quickly explain, but I hope to come back to it soon.

The first results were generally slower than Redis, but this could be due to the nature of the benchmark. I am already using pipelining extensively, over several connections, using the https://github.com/redis/rueidis library. I am not sure whether this is expected for DF or not. I can provide the full script to run my bench.

Another thing is that sometimes the performance of DF dropped from thousands of requests per second to something like 200; I am not sure why. Redis was super stable with the same bench.

I was testing with Ubuntu 23.10, and I think I also tried Debian 12 in one run (DigitalOcean CPU-optimized droplets).

My main hope for DF from the perspective of the Centrifugo project is the possibility to vertically scale PUB/SUB. With Redis and its sharded PUB/SUB in cluster mode it's possible, but the model is not very suitable for Centrifugo.

This is because Centrifugo has to deal with millions of topics while using only a few dedicated connections for managing subscriptions. In the sharded PUB/SUB case this means that only a pre-sharding technique can achieve that: limiting the slot space to some known value like 32 and creating a dedicated connection for each such slot, so that the slots land on specific nodes of the cluster (see the rough sketch below). This works, but it requires more care and planning, may result in unequal load on Redis cluster nodes due to the small number of shards, and changes the approach to creating keys. So an option to get more performance from a single Redis instance is nice to have.
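
For illustration only, the channel-to-shard mapping behind that pre-sharding idea could look roughly like the Go sketch below (the shard count, hash function, and hash-tag key format are hypothetical, not Centrifugo's actual implementation):

package main

import (
	"fmt"
	"hash/fnv"
)

// numShards is the fixed, pre-chosen number of shards (the "known value like 32").
const numShards = 32

// shardChannel maps a channel to one of numShards buckets and wraps the bucket
// in a Redis Cluster hash tag, so everything belonging to that bucket hashes to
// the same slot and therefore lands on the same cluster node.
func shardChannel(channel string) string {
	h := fnv.New32a()
	h.Write([]byte(channel))
	shard := h.Sum32() % numShards
	return fmt.Sprintf("{shard.%d}.%s", shard, channel)
}

func main() {
	// A dedicated connection per shard would then handle SSUBSCRIBE/SPUBLISH
	// for all channels mapped to that shard.
	fmt.Println(shardChannel("news"))
	fmt.Println(shardChannel("chat:room42"))
}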
