
Graphql process getting killed after injecting linkerd #5186

Closed
muhammad-fahad2068 opened this issue Nov 8, 2020 · 8 comments

@muhammad-fahad2068

Bug Report

Graphql process getting killed after injecting linkerd

What is the issue?

I have a GraphQL pod. If I inject Linkerd into it, the process keeps getting killed, but when there is no Linkerd, it works fine.

An nginx ingress is sending requests to the GraphQL service.

How can it be reproduced?

Create a GraphQL service and inject Linkerd into it (a sketch of an injected deployment is shown below).
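For reference, a minimal sketch of an injected workload looks roughly like the following; the deployment name, labels, and image are placeholders (not taken from this issue), and the linkerd.io/inject annotation on the pod template is the standard way to ask the proxy injector to add the linkerd-proxy sidecar on the next rollout:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graphql                      # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: graphql
  template:
    metadata:
      labels:
        app: graphql
      annotations:
        linkerd.io/inject: enabled   # asks the injector to add the linkerd-proxy sidecar
    spec:
      containers:
      - name: graphql
        image: example/graphql:latest   # placeholder image for the Apollo/PM2 server
        ports:
        - containerPort: 9090

Existing manifests can also be piped through the linkerd inject command instead of adding the annotation by hand.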

Logs, error output, etc

Below are the logs of the graphql container:

2020-11-08T11:57:54: PM2 log: App [app:1] exited with code [0] via signal [SIGKILL]
2020-11-08T11:57:54: PM2 log: App [app:1] starting in -cluster mode-
2020-11-08T11:57:54: PM2 log: App [app:1] online
🚀 Server ready at http://localhost:9090/
2020-11-08T12:00:13: PM2 log: App [app:0] exited with code [0] via signal [SIGKILL]
2020-11-08T12:00:13: PM2 log: App [app:0] starting in -cluster mode-
2020-11-08T12:00:14: PM2 log: App [app:0] online
🚀 Server ready at http://localhost:9090/
2020-11-08T19:58:09: PM2 log: App [app:1] exited with code [0] via signal [SIGKILL]
2020-11-08T19:58:09: PM2 log: App [app:1] starting in -cluster mode-
2020-11-08T19:58:09: PM2 log: App [app:1] online

Below are the logs of the linkerd-proxy:

[ 29812.351547787s] WARN inbound:accept{peer.addr=x.x.x.x:42156}:source{target.addr=x.x.x.x:9090}: linkerd2_app_core::errors: Failed to proxy request: connection closed before message completed
[ 29930.429372457s] WARN inbound:accept{peer.addr=x.x.x.x:46042}:source{target.addr=x.x.x.x:9090}: linkerd2_app_core::errors: Failed to proxy request: connection closed before message completed


linkerd check output

➜ ~ linkerd check
kubernetes-api

√ can initialize the client
√ can query the Kubernetes API

kubernetes-version

√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence

√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-config

√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist

linkerd-identity

√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-api

√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ tap api service is running

linkerd-version

√ can determine the latest version
‼ cli is up-to-date
is running version 2.8.1 but the latest stable version is 2.9.0
see https://linkerd.io/checks/#l5d-version-cli for hints

control-plane-version

‼ control plane is up-to-date
is running version 2.8.1 but the latest stable version is 2.9.0
see https://linkerd.io/checks/#l5d-version-control for hints
√ control plane and cli versions match

linkerd-addons

√ 'linkerd-config-addons' config map exists

linkerd-grafana

√ grafana add-on service account exists
√ grafana add-on config map exists
√ grafana pod is running

linkerd-tracing

√ collector service account exists
√ jaeger service account exists
√ collector config map exists
√ collector pod is running
√ jaeger pod is running

Status check results are √
➜ ~

Environment

  • Kubernetes Version: 1.15
  • Cluster Environment: AWS EKS
  • Host OS: ubuntu
  • Linkerd version: Client version: stable-2.8.1
    Server version: stable-2.8.1

Possible solution

I think this is somewhat related to #3596 and #4276, but since we are hosted on AWS EKS, I don't want to modify the control plane.

Additional context

With the Linkerd proxy, the GraphQL service returns 502 Bad Gateway; without the Linkerd proxy it works fine.

@muhammad-fahad2068
Author

Below are the debug logs of the linkerd-proxy:

[ 401.981098981s] DEBUG inbound:accept{peer.addr=x.x.x.x:57546}:source{target.addr=x.x.x.x:9090}:http1{name=service.ns.svc.cluster.local:80 port=9090 keep_alive=true wants_h1_upgrade=false was_absolute_form=false}:profile{addr=service.ns.svc.cluster.local:80}:target:http1: hyper::proto::h1::io: read 0 bytes
[ 445.157165792s] DEBUG inbound: h2::codec::framed_read: received; frame=GoAway { last_stream_id: StreamId(0), error_code: NO_ERROR }
[ 445.157200322s] DEBUG inbound: rustls::session: Sending warning alert CloseNotify
[ 445.157342133s] DEBUG inbound: tokio_reactor: dropping I/O source: 24
[ 445.158296337s] DEBUG inbound: linkerd2_app_core::accept_error: Connection failed error=connection error: Transport endpoint is not connected (os error 107)
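For readers following along: debug output like the above comes from raising the proxy's log level. Assuming the standard config.linkerd.io/proxy-log-level annotation is available in this release, a sketch of the pod-template annotation would be:

metadata:
  annotations:
    # Raise the injected proxy's log level so hyper/h2 connection
    # handling shows up in the linkerd-proxy container logs.
    config.linkerd.io/proxy-log-level: debug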

@Pothulapati
Contributor

I think the proxy is trying to connect to an external API that your application is calling, and by the time the proxy is able to reply, your application has crashed, so the proxy reports that the transport endpoint is not connected. Is there a way to get more logs from the GraphQL process about what is happening at the request level?
Some related things that I found on the internet

@grampelberg
Contributor

service.ns.svc.cluster.local:80 port=9090

I'm wondering if this is an issue with your :authority header. That should be :9090 and not :80.
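One way to check that hypothesis, as a sketch only (assuming the Service can be temporarily changed for the experiment), is to expose the container port directly so the Host/:authority that in-cluster clients send matches 9090:

apiVersion: v1
kind: Service
metadata:
  name: graphql
spec:
  ports:
  - name: graphql-9090     # experiment: make the service port equal the container port
    port: 9090
    targetPort: 9090

If the 502s disappear with the ports aligned, that would point at the application mishandling requests whose Host header carries :80.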

@muhammad-fahad2068
Author

@grampelberg below is the service manifest

apiVersion: v1
kind: Service
metadata:
  name: graphql
spec:
  ports:
  - name: graphql-80
    port: 80
    targetPort: 9090

and below is the container port in the pod spec:

ports:
- containerPort: 9090

The request comes in on port 80 and is eventually forwarded to 9090.

Without Linkerd it works fine; if that were the issue, it would not have worked anyway.

@grampelberg
Contributor

What language is your GraphQL server written in? What framework are you using?

@grampelberg
Contributor

If you could put a set of replication steps together for us, it'd be super helpful.

@muhammad-fahad2068
Author

We are using Apollo GraphQL in JS with Apollo Federation.

@stale

stale bot commented May 21, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label May 21, 2021
@stale stale bot closed this as completed Jun 4, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 16, 2021