-
Notifications
You must be signed in to change notification settings - Fork 998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: broken memcached error reporting #1741
Conversation
src/facade/reply_builder.cc
Outdated
@@ -277,7 +289,7 @@ void RedisReplyBuilder::SendBulkString(std::string_view str) { | |||
return Send(v, ABSL_ARRAYSIZE(v)); | |||
} | |||
|
|||
std::string_view RedisReplyBuilder::StatusToMsg(OpStatus status) { | |||
std::string_view SinkReplyBuilder::StatusToMsg(OpStatus status) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you change it? that's not good because these errors are not part of memcached protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not true:
949 if (cid->IsTransactional()) {
950 dist_trans.reset(new Transaction{cid});
951
952 if (!dist_trans->IsMulti()) { // Multi command initialize themself based on their mode.
953 if (auto st = dist_trans->InitByArgs(dfly_cntx->conn_state.db_index, args_no_cmd);
954 st != OpStatus::OK)
955 return cntx->SendError(st);
956 }
957
958 dfly_cntx->transaction = dist_trans.get();
959 dfly_cntx->last_command_debug.shards_count = dfly_cntx->transaction->GetUniqueShardCnt();
960 } else {
961 dfly_cntx->transaction = nullptr;
962 }
963 }
Andddd
<< CI{"SET", CO::WRITE | CO::DENYOOM | CO::NO_AUTOJOURNAL
So memcached commands are transactional. Same applies for get
incr
decr
etc. If the weren't, our transactional system would be broken, because memcached commands would not be a part of a transaction whereas redis commands would causing all sorts of broken behavior when interleaved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InitByArgs
always succeeds for these commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We basically validate everything for memcached already at the parser level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really bug prone tbh, some of the error reporting is already handled while some other doesn't(in case of this bug). Anyway, beyond the scope of this PR. I will see at some point how we can simplify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually no, we kinda need this because we need SendError(ErrorReply)
which uses status
.
to be fair, it won't ever arrive to that block that uses status so I ended up removing this and added a dcheck
for our sanity shake. You never know if we hit that so at least let's report this to be sure.
I think we should add a pytest that reproduces the problem. |
src/facade/conn_context.h
Outdated
void SendError(OpStatus status) { | ||
rbuilder_->SendError(status); | ||
} | ||
|
||
void SendSimpleString(std::string_view str) { | ||
rbuilder_->SendSimpleString(str); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One day I'll make ErrorReply's constructor implicit and remove all the overloads 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it could be nice.
@kostasrim I added this test to the main branch (locally on my machine) and nothing has crashed. |
@romange Yes because I forgot to check in the changes in
It works because you are not connecting to the replica. Will push in 5 and you should now get a nice error in main:
|
@@ -106,7 +106,7 @@ def admin_port(self) -> int: | |||
def mc_port(self) -> int: | |||
if self.params.existing_mc_port: | |||
return self.params.existing_mc_port | |||
return int(self.args.get("mc_port", "11211")) | |||
return int(self.args.get("memcached_port", "11211")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romange this is why it was passing....
src/server/string_family.cc
Outdated
@@ -976,7 +976,7 @@ void StringFamily::IncrBy(CmdArgList args, ConnectionContext* cntx) { | |||
int64_t val; | |||
|
|||
if (!absl::SimpleAtoi(sval, &val)) { | |||
return (*cntx)->SendError(kInvalidIntErr); | |||
return cntx->SendError(kInvalidIntErr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i know it looks fragile but memcache parser already does all the parsing checks because those conditions should not trigger with memcache.
src/facade/reply_builder.h
Outdated
@@ -57,6 +58,10 @@ class SinkReplyBuilder { | |||
|
|||
virtual void SendProtocolError(std::string_view str) = 0; | |||
|
|||
// You normally should not call this - maps the status | |||
// into the string that would be sent | |||
static std::string_view StatusToMsg(OpStatus status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not your fault but those local fixes make the whole thing more confusing.
Let's do the following:
- Move StatusToMsg to op_status.h as
std::string_view StastusMessage(OpStatus status);
- let's make
SendError(ErrorReply error)
non virtual method in SinkReplyBuilder.
I also changed how SendError is implemented for MC reply builder, I will send you a PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make SendError(ErrorReply error) non virtual method in SinkReplyBuilder.
We can't really do that because of void CapturingReplyBuilder::SendError(ErrorReply error)
actually overrides it with a different implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It overwrites the others as well (string_view), it just becomes a little less efficient because it can't move out of the string IIRC
* feat: implement CONFIG GET command The command returns all the matched arguments and their current values. In addition, this PR adds mutability semantics to each config - whether it can be changed at runtime. Fixes #1700 Signed-off-by: Roman Gershman <[email protected]> --------- Signed-off-by: Roman Gershman <[email protected]>
@@ -251,14 +251,39 @@ TEST_F(RedisReplyBuilderTest, ErrorBuiltInMessage) { | |||
} | |||
} | |||
|
|||
TEST_F(RedisReplyBuilderTest, ErrorReplyBuiltInMessage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romange This is a little "raw" but I wanted to verify that both of the execution paths are covered with ErrorReply
without the extra nuances of printing extra error information through the <<
operator. Moreover, it doesn't cover all sorts possible OpStatus
values and the e1
, e2
are just placeholders such that I can verify that the right execution path gets triggered
Partially solves #1730 (because I suspect there is some other issue as well -- either the user it doing something illegal with the replica or there is some other error which is not reported properly because of this bug).
ConnectionContext
overloadsoperator->
by returning aRedisReplyBuilder
. This is extremely error prone, because there are execution paths that work both withmemcached
andredis
protocol. If used improperly in a memcached execution path it will trigger an internal assertionProtocol::REDIS == protocol()
instead of properly report the error to the client.This PR addresses this, by adding the
SendError
family of functions inConnectionContext
. I replace the(*cntx)->
withcntx->SendError
which uses the internalSinkReplyBuilder
and dispatching to the right function based on the underline protocol.All in all this PR fixes:
Basically, I went over all of the memcached commands just to make sure that we report errors properly via the common base client
SinkReplyBuilder
and not via the specificRedisReplyBuilder
class via the overloaded arrow operator.I really don't like the
operator->
as it's extremely error prone and it's really hard to assert that an execution path is redis only (especially when both->
andreply_builder
are used within the same function -- it makes you scratch your head if it's a bug or an actual feature).Generally speaking a long term solution is to replace all of the common functions to strictly use a common interface via
SinkReplyBuilder
such that are protocol agnostic (since anyway they should be used by both memcached and redis)