-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Logs, log levels #797
Logs, log levels #797
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say we should bring "received a new message" from tracing back to debug, but that is indeed spammy. But we kinda need a way to figure out if a node is able to receive messages and was receiving messages.
@ppca @ChaoticTempest Overall, I think we should be looser with debug logs. They should give as full understanding of what is going on in our system, not just the "highlights". Info is for highlights, it should work fine once we have log filtering. |
@ppca @ChaoticTempest I've added more logs, and removed or moved some of them. Changed log levels, and now we should be able to use them in GCP. |
Adding logs look good to me. But we still need to enable toggling for this to work in GCP? If you want to see these logs now without the work of enabling toggle, all the trace!("") won't show up in GCP FYI. |
@@ -304,7 +315,7 @@ impl MpcSignProtocol { | |||
let from_state = format!("{state}"); | |||
state = match state.advance(&mut self, contract_state).await { | |||
Ok(state) => { | |||
tracing::debug!("advance ok: {from_state} => {state}"); | |||
tracing::trace!("advance ok: {from_state} => {state}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if @ppca will like this because it might not be visible that we're advancing states in our normal debug logs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made it trace because it is too spammy. We should fix it later with logging spans.
self.gc | ||
.retain(|_, instant| instant.elapsed() < Duration::from_millis(cfg.garbage_timeout)); | ||
let removed = before.saturating_sub(self.gc.len()); | ||
if removed > 0 { | ||
tracing::warn!("garbage collected {} presignatures", removed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't say this is a warning though, since this is purely informational where warning should be reserved for logs that should close to never happen. So it should be info
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should minimize the number of garbage collections since it is a waste and an indicator that our system is not efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're misunderstanding what garbage collection is. Garbage collections happens no matter what, as in every single triple, presignature and signature will eventually be garbage collected and this will always be called. It is our way to keep tally of what IDs to keep around until messages stop from other protocol that linger behind or are delayed due to restarts or other misc behaviors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kk, I got it
what I meant is that we should keep track of gcd triples, pre-signatures that were not used for any reason
crate::metrics::NUM_SIGN_REQUESTS_MINE | ||
.with_label_values(&[my_account_id.as_str()]) | ||
.inc(); | ||
if is_mine { | ||
crate::metrics::NUM_SIGN_REQUESTS_MINE | ||
.with_label_values(&[my_account_id.as_str()]) | ||
.inc(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is correct. @ppca is this the right metric for MINE sign requests or does this include requests that we are a part of the signer subset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an intentional change. I believe that it is more important now to have a metric for the proposer of the signature. But we can have both. Also, "MINE" semantics is closer to me=proposer.
tracing::warn!("signature protocol timed out completely"); | ||
return Err(ProtocolError::Other( | ||
anyhow::anyhow!("signature protocol timeout completely").into(), | ||
anyhow::anyhow!("signature protocol timed out completely").into(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't rename it timed out
because it's much easier to search for timeout
even though timed out
is more grammatically correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but we have a lot of "timed out" logs. We should unify it all.
self.completed.retain(|_, timestamp| { | ||
timestamp.elapsed() < Duration::from_millis(cfg.signature.garbage_timeout) | ||
}); | ||
let garbage_collected = before.saturating_sub(self.completed.len()); | ||
if garbage_collected > 0 { | ||
tracing::warn!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here with this garbage collection
self.gc.retain(|_, timestamp| { | ||
timestamp.elapsed() < Duration::from_millis(cfg.garbage_timeout) | ||
}); | ||
let garbage_collected = before.saturating_sub(self.gc.len()); | ||
if garbage_collected > 0 { | ||
tracing::warn!("garbage collected {} triples", garbage_collected); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tracing::warn!("garbage collected {} triples", garbage_collected); | |
tracing::info!("garbage collected {} triples", garbage_collected); |
Logs are opinionated. I want us to start discussing how, where, and what we are logging.