-
Notifications
You must be signed in to change notification settings - Fork 33
Stackdriver logging #68 #81
Conversation
Note that even with the threadlocal protection, turning logging to FINE globally is preventing deployment, but without an indication why. It may just be load and the instance takes too long to respond. Note that netty appears to have it's own internal SLF4J logger impl that is correctly getting logged, but is taking its FINE setting from the global setting rather than an Selecting turning on FINE logging for some components does work. |
I have worked out why deployment fails when
Even thought the logging configuration is:
So I may put in an explicit filter to remove those netty debug records? Also we can consider if we want to monitor low memory and stop generating verbose logs? thoughts? |
@meltsufin updated to use the same trace id label as nginx log |
@gregw Even at the default logging level, it looks like we get two log messages from grpc for each log message that we send. I think during the actual remote logging, we should be suppressing all logs from google-cloud-logging. Is this feasible?
|
static final Logger logger = Logger.getLogger(RequestContextScope.class.getName()); | ||
|
||
private static final String X_CLOUD_TRACE = "x-cloud-trace-context"; | ||
private static final ThreadLocal<String> traceid = new ThreadLocal<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use camel case traceId
.
if (id != null) { | ||
int slash = id.indexOf('/'); | ||
if (slash >= 0) { | ||
id = id.substring(0, slash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we just dropping the span id? I'm not sure this is what we're supposed to be doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not sure. The compat version we inherited did a scan for the first non hex character and split there, so I just simplified and inherited that behaviour. Can you investigate if we can leave it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will find out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got a response from the logging team indicating that we should only log the trace id without the span id. So, this is good.
import java.util.logging.Level; | ||
import java.util.logging.Logger; | ||
|
||
public class RequestContextScope implements ContextHandler.ContextScopeListener { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some javadocs to explain the purpose of the class?
* @param traceid The trace ID | ||
* @param runnable the runnable | ||
*/ | ||
public static void runWith(String traceid, Runnable runnable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this method used somewhere, or just provided as a utility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a utility that I thought we might need. I will drop for now as it will be trivial to readd if there ever is a need.
@Override | ||
public synchronized void publish(LogRecord record) { | ||
// check we are not already flushing logs | ||
if (Boolean.TRUE.equals(flushing.get())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a matter of style, but I would save a line and write it as:
// check we are not already flushing logs
if (!Boolean.TRUE.equals(flushing.get())) {
super.publish(record);
}
*/ | ||
public TracingLogHandler(String logName, LoggingOptions options, MonitoredResource resource) { | ||
super(logName, options, resource); | ||
monitored = MonitoredResource.newBuilder("gae_app") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're sort of overwriting the resource that is passed in here. Maybe create the resource only in the no-arg constructor instead?
@@ -0,0 +1,9 @@ | |||
.level = INFO | |||
io.netty.level = INFO | |||
io.gcr.level = INFO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to set log levels for netty and gcr here if we have the flushing guard in TracingLogHandler.publish()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it because netty logs from background threads. However there is still the issue that not all of netty's log mechanism appears not to see these settings anyway, so perhaps we just need an explicit filter in the publish method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would an explicit filter work so that it deals with the Netty background threads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the approach taken by the gcloud-java client library is to have fixed list of known log names that it explicitly scans for and removes the stackdriver handler from.
This is the approach that should also stop looping, but it not.
The same list could simply be applied within the log handler and any logs from those handlers would be discarded. This list could either be a complete list, in which case we would not need either the threadlocal approach nor the handler deregistration.
Alternately the list can just be limited to the loggers that are not reacting to the logging properties configuration.
Currently I don't like any solution. My preference would be to have no filtering and just loop protection. Overly verbose logging should be handled gracefully (with discarded logs and a message to that effect) and then the user can just configure logging to only log what they are interested in.
So I need to investigate why those netty classes are ignoring the properties file.
@meltsufin this is still a work in progress as I would like to:
|
@meltsufin I measured memory usage more accurately to be around 4KB per log. I have raised googleapis/google-cloud-java#1449 to see if that can be reduced and/or some limits put on the size of the queued logs. |
We're really blocking on 2 things:
|
Noting that a related issue was filed against GRPC: grpc/grpc-java#2515 |
@meltsufin The https://github.com/GoogleCloudPlatform/jetty-runtime/tree/master-68-docker-args branch is based on this branch and now includes:
The #41 branch for sessions is based of this new branch. |
New image has been released and is available at I just opened #157 to bump the base image. |
@meltsufin I'm currently seeing problems with the classloading tests when I run them. These tests were passing, but are now failing, so I need to investigate a bit more. Probably wont get a chance until Tuesday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're ready to update the GaeFlexLoggingEnhancer
references.
Provides SLF4J API. Requires a slf4j implementation (eg slf4j-simple-impl) | ||
otherwise a noop implementation is used. | ||
|
||
TODO Remove once jetty-9.5.4 released |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean jetty-9.4.5?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops yes.
pom.xml
Outdated
@@ -29,13 +29,14 @@ | |||
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | |||
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding> | |||
<maven.build.timestamp.format>yyyy-MM-dd-HH-mm</maven.build.timestamp.format> | |||
<gcloud.api.version>0.13.0-beta</gcloud.api.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0.17.1 is out now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, on it now. Hopefully ready tomorrow (as will a 9.4.5 release with simpler class hiding)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@meltsufin I can't see a 0.17.1 in maven central. But I can see a 1.0.0! http://repo2.maven.org/maven2/com/google/cloud/google-cloud-logging/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, even better! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@meltsufin I just tried 1.0.0, but it failed with this issue: googleapis/google-cloud-java#1994
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix has been merged, please retry when you get a chance.
…ime into master-68
README.md
Outdated
com.google.cloud.logging.LoggingHandler.level=FINE | ||
com.google.cloud.logging.LoggingHandler.log=gae_app.log | ||
com.google.cloud.logging.LoggingHandler.resourceType=gae_app | ||
com.google.cloud.logging.LoggingHandler.enhancers=com.google.cloud.logging.GaeFlexLoggingEnhancer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be TraceLoggingEnhancer
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is running on GAE Flex, you no longer need to specify the resourceType and enhancers. it gets auto-detected based on the GAE_INSTANCE env var. TraceLoggingEnhancer is automatically added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Thanks Jisha!
sun.net.level=INFO | ||
|
||
handlers=com.google.cloud.logging.LoggingHandler | ||
com.google.cloud.logging.LoggingHandler.level=FINE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the TraceLoggingEnhancer
be here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. It is now automatic and I have updated the README.
@meltsufin while this PR is looking close to ready, @janbartel is having problems merging with her session work because of dependency strangeness in |
Thank you for your hard work on this Greg and Mike! This makes the Flexible experience much better. |
@gregw Something in this PR broke our integration with the cloud debugger. I suspect it's the class hiding. Investigating... |
The implementation of #68 is now working with the latest google-cloud-logging API supporting the changes we requested.
Currently we still have threadlocal loop protection, which I will now test if it is still needed.