Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AWS resource provider #7927

Closed
wants to merge 2 commits into from

Conversation

laurit
Copy link
Contributor

@laurit laurit commented Feb 28, 2023

Resolves #7775
Ec2 resource provider adds 4s to agent startup, maybe we shouldn't add it to the agent after all or somehow disable it by default?

[otel.javaagent 2023-02-28 11:10:45:442 +0200] [main] DEBUG io.opentelemetry.javaagent.tooling.AgentInstaller - io.opentelemetry.javaagent.tooling.AgentInstaller loaded on io.opentelemetry.javaagent.bootstrap.AgentClassLoader@4cdf35a9
[otel.javaagent 2023-02-28 11:10:47:925 +0200] [main] DEBUG io.opentelemetry.contrib.aws.resource.SimpleHttpClient - SimpleHttpClient fetch string failed.
java.io.InterruptedIOException: timeout
	at okhttp3.internal.connection.RealCall.timeoutExit(RealCall.kt:398)
	at okhttp3.internal.connection.RealCall.callDone(RealCall.kt:360)
	at okhttp3.internal.connection.RealCall.noMoreExchanges$okhttp(RealCall.kt:325)
	at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:209)
	at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154)
	at io.opentelemetry.contrib.aws.resource.SimpleHttpClient.fetchString(SimpleHttpClient.java:71)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.fetchString(Ec2Resource.java:151)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.fetchToken(Ec2Resource.java:128)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.buildResource(Ec2Resource.java:68)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.buildResource(Ec2Resource.java:49)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.<clinit>(Ec2Resource.java:35)
	at io.opentelemetry.contrib.aws.resource.Ec2ResourceProvider.createResource(Ec2ResourceProvider.java:16)
	at io.opentelemetry.sdk.autoconfigure.ResourceConfiguration.configureResource(ResourceConfiguration.java:59)
	at io.opentelemetry.sdk.autoconfigure.AutoConfiguredOpenTelemetrySdkBuilder.build(AutoConfiguredOpenTelemetrySdkBuilder.java:335)
	at io.opentelemetry.javaagent.tooling.OpenTelemetryInstaller.installOpenTelemetrySdk(OpenTelemetryInstaller.java:29)
	at io.opentelemetry.javaagent.tooling.AgentInstaller.installBytebuddyAgent(AgentInstaller.java:114)
	at io.opentelemetry.javaagent.tooling.AgentInstaller.installBytebuddyAgent(AgentInstaller.java:94)
	at io.opentelemetry.javaagent.tooling.AgentStarterImpl.start(AgentStarterImpl.java:78)
	at io.opentelemetry.javaagent.bootstrap.AgentInitializer.initialize(AgentInitializer.java:35)
	at io.opentelemetry.javaagent.OpenTelemetryAgent.startAgent(OpenTelemetryAgent.java:57)
	at io.opentelemetry.javaagent.OpenTelemetryAgent.premain(OpenTelemetryAgent.java:45)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:491)
	at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:503)
Caused by: java.net.SocketException: Socket closed
	at java.base/sun.nio.ch.NioSocketImpl.endConnect(NioSocketImpl.java:531)
	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:615)
	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
	at java.base/java.net.Socket.connect(Socket.java:633)
	at okhttp3.internal.platform.Platform.connectSocket(Platform.kt:128)
	at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:295)
	at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:207)
	at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)
	at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
	at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74)
	at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255)
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
	... 23 more
[otel.javaagent 2023-02-28 11:10:49:929 +0200] [main] DEBUG io.opentelemetry.contrib.aws.resource.SimpleHttpClient - SimpleHttpClient fetch string failed.
java.net.SocketTimeoutException: Connect timed out
	at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:546)
	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
	at java.base/java.net.Socket.connect(Socket.java:633)
	at okhttp3.internal.platform.Platform.connectSocket(Platform.kt:128)
	at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:295)
	at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:207)
	at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)
	at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
	at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74)
	at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255)
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
	at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154)
	at io.opentelemetry.contrib.aws.resource.SimpleHttpClient.fetchString(SimpleHttpClient.java:71)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.fetchString(Ec2Resource.java:151)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.fetchIdentity(Ec2Resource.java:132)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.buildResource(Ec2Resource.java:72)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.buildResource(Ec2Resource.java:49)
	at io.opentelemetry.contrib.aws.resource.Ec2Resource.<clinit>(Ec2Resource.java:35)
	at io.opentelemetry.contrib.aws.resource.Ec2ResourceProvider.createResource(Ec2ResourceProvider.java:16)
	at io.opentelemetry.sdk.autoconfigure.ResourceConfiguration.configureResource(ResourceConfiguration.java:59)
	at io.opentelemetry.sdk.autoconfigure.AutoConfiguredOpenTelemetrySdkBuilder.build(AutoConfiguredOpenTelemetrySdkBuilder.java:335)
	at io.opentelemetry.javaagent.tooling.OpenTelemetryInstaller.installOpenTelemetrySdk(OpenTelemetryInstaller.java:29)
	at io.opentelemetry.javaagent.tooling.AgentInstaller.installBytebuddyAgent(AgentInstaller.java:114)
	at io.opentelemetry.javaagent.tooling.AgentInstaller.installBytebuddyAgent(AgentInstaller.java:94)
	at io.opentelemetry.javaagent.tooling.AgentStarterImpl.start(AgentStarterImpl.java:78)
	at io.opentelemetry.javaagent.bootstrap.AgentInitializer.initialize(AgentInitializer.java:35)
	at io.opentelemetry.javaagent.OpenTelemetryAgent.startAgent(OpenTelemetryAgent.java:57)
	at io.opentelemetry.javaagent.OpenTelemetryAgent.premain(OpenTelemetryAgent.java:45)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:491)
	at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:503)
[otel.javaagent 2023-02-28 11:10:49:933 +0200] [main] DEBUG io.opentelemetry.contrib.aws.resource.EksResource - Not running on k8s.
[otel.javaagent 2023-02-28 11:10:49:964 +0200] [main] DEBUG io.opentelemetry.sdk.internal.JavaVersionSpecific - Using the APIs optimized for: Java 9+

@github-actions github-actions bot requested a review from theletterf February 28, 2023 11:24
@mateuszrzeszutek
Copy link
Member

Is there any env var that we can check before calling that endpoint? E.g. kinda like the ECS resource provider does: https://github.com/open-telemetry/opentelemetry-java-contrib/blob/main/aws-resources/src/main/java/io/opentelemetry/contrib/aws/resource/EcsResource.java#L53

@willarmiros do you know of any environmental info that'd allow early skipping of the HTTP call in the EC2 resource provider?

@willarmiros
Copy link

Unfortunately none that I'm aware of, the IMDS IP address is static so there's no need to store it in an env var like ECS does. There's no mention of env vars in the IMDS docs: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html

My best suggestion would be to optimize the timeout length based on observations of how long it takes to actually reach the IMDS on an EC2 instance (I would expect it to be 10s or 100s of milliseconds, definitely not 4 seconds) and adjusting the timeout to be 2 or 3x that average connection time.

Alternatively, it is known that IMDS is intentionally disabled in Fargate workloads because they obscure the underlying instance. So, we could run ECS/Fargate resource detection first, and if it's a Fargate workload, just skip the EC2 check altogether.

@breedx-splk
Copy link
Contributor

Closing a few old/stale draft PRs. Reopen if interested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants