Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix server: fix last error reply #3728

Merged
merged 3 commits into from
Sep 23, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions src/facade/reply_capture.cc
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,9 @@ void CapturingReplyBuilder::SendMGetResponse(MGetResponse resp) {
}

void CapturingReplyBuilder::SendError(OpStatus status) {
if (status != OpStatus::OK) {
last_error_ = StatusToMsg(status);
}
SKIP_LESS(ReplyMode::ONLY_ERR);
Capture(status);
}
Expand Down Expand Up @@ -215,6 +218,8 @@ void CapturingReplyBuilder::Apply(Payload&& pl, RedisReplyBuilder* rb) {

CaptureVisitor cv{rb};
visit(cv, std::move(pl));
// Consumed and printed by InvokeCmd. We just send the actual error here
std::ignore = rb->ConsumeLastError();
}

void CapturingReplyBuilder::SetReplyMode(ReplyMode mode) {
Expand Down
4 changes: 4 additions & 0 deletions src/server/main_service.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1172,6 +1172,8 @@ std::optional<ErrorReply> Service::VerifyCommandState(const CommandId* cid, CmdA
}

void Service::DispatchCommand(CmdArgList args, facade::ConnectionContext* cntx) {
absl::Cleanup clear_last_error(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this here? what are the cases that we ignore?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any error that has to do with parsing the arguments, unknown command etc -- we do the same in InvokeCmd. These are not interesting to us.

An example from a few lines below:

  1210   if (auto err = VerifyCommandState(cid, args_no_cmd, *dfly_cntx); err) {
  1211     if (auto& exec_info = dfly_cntx->conn_state.exec_info; exec_info.IsCollecting())
  1212       exec_info.state = ConnectionState::ExecInfo::EXEC_ERROR;
  1213 
  1214     // We need to skip this because ACK's should not be replied to
  1215     // Bonus points because this allows to continue replication with ACL users who got
  1216     // their access revoked and reinstated
  1217     if (cid->name() == "REPLCONF" && absl::EqualsIgnoreCase(ArgS(args_no_cmd, 0), "ACK")) {
  1218       server_family_.GetDflyCmd()->OnClose(dfly_cntx);
  1219       return;
  1220     }
  1221     dfly_cntx->SendError(std::move(*err));
  1222     return;
  1223   }

SendError will register an error message for us to print but 1. it's not really interesting 2. it won't be printed at the right time.

P.s. we do the same in InvokeCmd() for similar paths.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean these are not interesting for us?
we want to see error messages for commands sent to dragonfly, so if we have problem we will know the command and the params sent and it will be easy to reproduce the problem and to know if this is bug on our side or on client side.

Copy link
Contributor

@kostasrim kostasrim Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because those paths are not bugs related to commands, they are "soft errors". For example, acl authentication failed has nothing really interesting for us. The person could not execute the command because of their ACL's. If for some reason the client thinks that a command got rejected because of a bug they can file an issue. On the other hand, I most certainty don't think it would be useful to have a bunch of warnings because a client could not execute something that they have not credentials for. Similarly, for VerifyCommandState I don't think it's interesting let's say that a user could not run a command because the command is not allowed to be run under a script. These are soft errors and if we do have bugs, they can file the issue and it will be immediately apparent. Same applies for a command that gets rejected for arity (because the user did not specify enough arguments). I basically classify all of these errors are "soft" and I wanted to avoid logging them because even if there are bugs it will be easy to spot them. Furthermore, we log every second and that means that an ACL error will hide a command error and I believe the second is more important

[cntx]() { std::ignore = cntx->reply_builder()->ConsumeLastError(); });
CHECK(!args.empty());
DCHECK_NE(0u, shard_set->size()) << "Init was not called";

Expand Down Expand Up @@ -1375,6 +1377,8 @@ bool Service::InvokeCmd(const CommandId* cid, CmdArgList tail_args, ConnectionCo
ReplyGuard reply_guard(cntx, cid->name());
#endif
uint64_t invoke_time_usec = 0;
auto last_error = cntx->reply_builder()->ConsumeLastError();
DCHECK(last_error.empty());
try {
invoke_time_usec = cid->Invoke(tail_args, cntx);
} catch (std::exception& e) {
Expand Down
1 change: 1 addition & 0 deletions src/server/multi_command_squasher.cc
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,7 @@ bool MultiCommandSquasher::ExecuteStandalone(StoredCmd* cmd) {
if (verify_commands_) {
if (auto err = service_->VerifyCommandState(cmd->Cid(), args, *cntx_); err) {
cntx_->SendError(std::move(*err));
std::ignore = cntx_->reply_builder()->ConsumeLastError();
return !error_abort_;
}
}
Expand Down
Loading