-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Missing HTTP status codes for cancelled search requests #73849
Comments
Pinging @elastic/kibana-security (Team:Security) |
Pinging @elastic/kibana-app-arch (Team:AppArch) |
we see a similar problem - i tried the It happens for example every few clicks on the "view single document"-button at a message. sometimes it even says that there is no document with that id - a refresh then works. sometimes we also get following error-message "proxy error" with following log-output
the apache access and error logs look like
It also happens connecting directly to Kibana without using the in-between apache reverse-proxy Happens with latest Firefox 79.0 and Chrome 84.0.4147.125 on windows 10 With ELK 6.8 we did not have those problems. |
@BBQigniter - I have been seeing the errors even after adding the ProxyBadHeader setting. I think there are two things happening here - some responses are blank with the missing header (ProxyBadHeader should resolve those), and some responses are being terminated prematurely by Kibana (connection reset by peer) - I think these are giving the remaining errors. The errors are less common when setting ProxyBadHeader, but it could also be a placebo. This is a pretty glaring issue since errors popping up while using Kibana are not at all ideal. |
yes, those errors are super annoying and thankfully we only have those currently on our test-cluster where I'm testing the upgrade procedure. Just upgraded whole cluster to version 7.9 - same errors :( |
@joshporter1 In terms of the search feature, unnecessary requests (for example when running a new search before the previous one had finished) get canceled, from the browser, using an It is typical for a request canceled in flight, to show an empty response body and no status code (status code 0). Here is a command line example fetching and cancelling a request to some random online API in Chrome. I was surprised however to find out that Firefox doesn't mark the request as canceled, but it's still the same scenario: So anyway, the empty responses are part of a normal workflow. I wonder how we can work around this in a reverse proxy scenario however. |
@lizozom I kind of figured that's what was happening, the fact that Firefox wasn't showing it as such is a tad confusing. However, Kibana still shows errors in other browsers such as Chrome and Brave since the reverse proxy server returns 502's on some of the aborted requests. I'm hoping to find a good solution to solve this at the reverse proxy layer but I've combed through the docs a few times and haven't found anything useful. |
Can you share some of the resources you've been looking into? I'm curious to try and do this setup, to see if I can reproduce the issue. |
Sure thing, I've mostly been digging through the mod_proxy docs at the options available for the Apache reverse proxy. The |
Like I wrote in my first comment, I saw "proxy errors" and or "no document with that id" also while pointing my browser directly to Kibana without going through the reverse proxy. It only happens not that often and is hardly reproducable. The next thing I sometimes experience is, that clicking on links etc. sometimes does not work correctly like in this gif. This also happens with and without reverse proxy usage. If this happens I have to refresh the browser tab in Chrome or Firefox. Kibana 7.9 |
@BBQigniter can you please explain what the gif should be showing? |
@lizozom sure. sry, if it's not clear
|
I now found this wonderful blog https://newweaver.com/the-mystery-of-the-502-kibana-errors/ This guy describes EXACTLY the same problem!
and restarted Kibana. I now have that "fix" for a few minutes and didn't get any proxy errors for some time. Hopefully that's it. |
@BBQigniter thanks for that link - please keep me updated on if it does the trick! |
@joshporter1 it seems the fix really helps, no "proxy error" issues yet while clicking around. Only issues like in the gif and later comment described sometimes still happen. in the linked blog the guy also mentions #44062 so if elastic updates the underlying nodejs to >v12.17.0 it should be fixed without the need to add the line to the function |
We constantly update nodejs versions, so I think its safe to assume this will be resolved soon 🙏 |
@joshporter1 any updates on this? Can this be closed? |
@lizozom I applied the workaround "fix" but have not updated or removed the code to verify if it was solved in a future version. So as far as I can tell, it shouldn't be closed unless it was truly resolved without requiring users to alter the core code. If the underlying nodejs version was upgraded then the issue should probably be re-evaluated. |
As far as I can tell, this bug is being caused by nodejs/node#27363. Once Kibana is upgraded to Node > 12.9.0, this will be resolved. There are also potential workarounds that are discussed later. Node.js has three timeouts that apply to HTTP requests:
There's a bug in Node.js < 12.9.0 where when multiple HTTP requests occur over the same TCP socket when keepalives are used, that the headers timeout is not reset correctly. This is what the headers timeout is supposed to control: But with the bug, this is how it's used: I've been able to replicate this behavior against a test Node.js HTTP server, and Kibana itself using the default timeouts and a basic test client, which is included below. This effectively means that Kibana is pretty much unusable currently behind a reverse-proxy that reuses TCP sockets. const http = require('http');
const agent = new http.Agent({
keepAlive: true,
keepAliveMsecs: 30000
});
let n = 0;
const request = () => {
const req = http.request({
protocol: 'http:',
host: 'localhost',
port: 5601,
path: '/',
method: 'GET',
agent,
}, function (res) {
let data = '';
res.on('data', (chunk) => {
data += chunk;{}
});
res.on('end', () => {
console.log(`Done: [${res.statusCode}] - ${data}`);
if (++n == 1) {
setTimeout(request, 58000);
}
})
});
req.on('socket', (socket) => {
socket.write("GET / HTTP/1.1\r\n");
setTimeout(() => {
socket.write(`user-agent: ${Math.random() * 1000}\r\n`);
}, 1000);
setTimeout(() => {
socket.write(`user-agent: ${Math.random() * 1000}\r\n`);
}, 2000);
setTimeout(() => {
socket.write('\r\n')
req.end();
}, 3000);
});
req.on('error', (error) => {
console.error('error', error);
});
};
request(); Potential workaroundsI don't really like any of these, but I'm interested in hearing your opinions @restrry. Allow the end-user to configure the headers timeoutThis would allow the end-user to work-around this bug when Kibana is behind a reverse-proxy, and set it to a value higher than the Automatically configure the headers timeout to be larger than the keepalive timeoutThis would still have the nondeterministic behavior as above, but we at least don't end up violating end-users expectations when we do get to upgrade to Node 12.9.0. This will potentially lead to the headers timeout being set to a much larger value than it would be by default, minimizing the protection that it provides against the [slowloris attack](The headers timeout was originally added to prevent against the slowloris attack so we end up removing this protection. ). Disable the headers timeoutIt's possible to disable the headers timeout by setting this to 0. This removes the protection entirely against the slowloris attack. Removing this protection for users who aren't reusing TCP keepalives and reusing sockets is a potential change in expectation if they were relying on Kibana itself providing this protection. |
I read through the nodejs thread nodejs/node#27363 and it seems that:
@kobelb WDYT about this? const headersTimeout = listenerOptions.keepaliveTimeout + 2*server.listener.headersTimeout;
server.listener.headersTimeout = headers Timeout; |
@restrry I think your suggestion is the "least bad" option that we have, unfortunately... I'll take it :) |
IMO yes, but I'd let @kobelb confirm as he did most of the research |
Everything is looking good from my perspective. I just tested out |
Kibana version: 7.8.1
Elasticsearch version: 7.8.1
Server OS version: Ubuntu 16.04.6 LTS
Browser version: Firefox 79.0b9
Browser OS version: macOS Catalina 10.15.6
Original install method (e.g. download page, yum, from source, etc.): apt
Describe the bug: Some responses from
/internal/search/es
respond with empty status code, header, and body. Presumably from cancelled queries.Steps to reproduce:
Expected behavior:
Network log should contain proper queries to the ES API with corresponding responses. Cancelled queries should respond with a proper HTTP status code (204 seems appropriate).
Screenshots (if relevant):
Errors in browser console (if relevant):
Failed to load resource: the server responded with a status of 502 (Proxy Error)
- this only occurs when Kibana is reverse proxied.Any additional context:
This is primarily an issue when reverse proxying Kibana. Apache fusses when it receives an upstream response that doesn't have a valid header or is missing the response code.
A temporary remediation for this with Apache is setting
ProxyBadHeader Ignore
in the server config. Obviously this is not the ideal solution, since the cancelled inflight requests should return a valid HTTP status code despite responding with no content.Edit: The ProxyBadHeader fix doesn't fully resolve the problem. Not 100% why but I speculate that the upstream server is both sending empty responses and terminating requests early.
The text was updated successfully, but these errors were encountered: