NoNodeAvailableException after 2 hours of bulk indexing #1868
We also receive this exact same problem, seemingly randomly, with ES version 0.19.2 on both server and client. We have been unable to pinpoint the problem; we have tried running multiple clusters, a single cluster, etc. on the server side and it doesn't seem to affect anything. It happens almost daily for us and we're looking for ways to narrow down why. Using ES head, our cluster health is green and nothing appears out of the ordinary. The exception I received just a few moments ago: org.elasticsearch.client.transport.NoNodeAvailableException: No node available
The other exception we get that results in a NoNodeAvailableException is this: org.elasticsearch.client.transport.NoNodeAvailableException: No node available
I'm curious, are you using the transport client in sniff mode or in regular mode?
@holdenk -- @GrantGochnauer is using ES in regular mode.
You might want to try it with sniff mode turned on and see if it still happens. I can take a look at the regular-mode code, but I know we used to see this in sniff mode and then fixed a bug in the sniff-mode transport and it worked.
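For anyone unsure how to switch modes: in the 0.19-era Java API, sniff mode is controlled by the `client.transport.sniff` setting. A minimal sketch follows; the settings are collected in a plain map so the example compiles without the ES jar, and the commented lines show the assumed `TransportClient` construction:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: the transport-client settings that turn sniff mode on.
// In a real client you would pass them via the 0.19-era builder API:
//   Settings settings = ImmutableSettings.settingsBuilder()
//           .put("cluster.name", "my-cluster")
//           .put("client.transport.sniff", true).build();
//   TransportClient client = new TransportClient(settings)
//           .addTransportAddress(new InetSocketTransportAddress("es-host", 9300));
public class SniffSettings {
    static Map<String, String> clientSettings(String clusterName) {
        Map<String, String> s = new LinkedHashMap<>();
        s.put("cluster.name", clusterName);       // must match the server-side cluster name
        s.put("client.transport.sniff", "true");  // discover the other cluster nodes automatically
        return s;
    }
}
```

With sniff enabled, the client samples the cluster state and adds the discovered data nodes to its connection pool, so losing the one configured node no longer exhausts the pool.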
@holdenk thanks for the advice, we'll give it a try.
@holdenk - Unfortunately this problem still happens with what seems to be the same frequency as it did before turning sniff mode on.
As a workaround, I just coded my indexer to retry if such an exception occurs: while (true) {
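A fleshed-out version of that retry idea might look like the sketch below. All names here are hypothetical; in the real indexer the caught exception would be NoNodeAvailableException, and `bulkCall` would execute the bulk request. A plain RuntimeException stands in so the sketch compiles without the ES jar:

```java
import java.util.function.IntSupplier;

// Hypothetical retry wrapper: re-runs the bulk call until it succeeds
// or maxRetries attempts have been used, sleeping between attempts.
public class RetryingIndexer {
    static int runWithRetry(IntSupplier bulkCall, int maxRetries, long sleepMillis) {
        int attempt = 0;
        while (true) {
            attempt++;
            try {
                return bulkCall.getAsInt();          // e.g. the number of docs indexed
            } catch (RuntimeException e) {           // stand-in for NoNodeAvailableException
                if (attempt >= maxRetries) throw e;  // give up: rethrow the last failure
                try {
                    Thread.sleep(sleepMillis);       // back off before the next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

Note that blind retries only paper over the symptom; they don't explain why the node dropped out of the client's pool in the first place.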
Which version are you using? Also, can you turn on logging to TRACE on org.elasticsearch.client.transport to see why it gets disconnected?
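Since the transport client runs inside the indexing application, that logger is configured in the application's own log4j setup (the 0.x-era client logged through log4j). Assuming a standard log4j 1.x properties file, a fragment like this should surface the disconnect reasons:

```properties
# Assumed log4j.properties fragment for the client application;
# appender setup is whatever your existing config already uses.
log4j.logger.org.elasticsearch.client.transport=TRACE
```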
I used elasticsearch 0.19.2. I'm not sure, but this might be caused by our networking components (sometimes some SSH sessions get disconnected without any obvious reason; ssh is configured with keep-alive). For example:
We are also using 0.19.2 and get the NoNodeAvailableException when both the ES server process and the ES client process are on the same physical machine, without any network requirements. We're trying to use snazzy's Thread.sleep code above as a workaround. We've also enabled TRACE logging.
It would be interesting to see the logs that I asked for...
@kimchy - This seems related: 13:29:23.306 [New I/O client worker #1-3] WARN org.elasticsearch.transport.internalWarn[104] - [Aardwolf] Received response for a request that has timed out, sent [9546ms] ago, timed out [1526ms] ago, action [cluster/nodes/info], node [[Damian, Margo][UIg42PPiS0SE2X_HO3wLFA][inet[/192.168.150.62:9301]]], id [1831]
Yea, so a timeout is not a disconnection... Are you using sniff or not now? Turn off sniffing, add several nodes, and use 0.19.3; let's see if you still get it (there are more improvements when sniff is set in the upcoming 0.19.4).
Hello @kimchy, we're using 0.19.3 and sniff is turned on. I put together a simple piece of reconnect code that attempts to reconnect a few times before giving up. Today we saw an interesting stacktrace that we haven't seen before: 08:54:39.694 [http-bio-8080-exec-65] WARN c.v.p.s.r.ElasticSearchRepository - NoNodeAvailableException caught, retrying request.
@jereanon is there a chance that you can recreate it?
FWIW, I was experiencing the NoNodeAvailableException during bulk indexing with 0.19.8. Even after reducing the batch size from 5000 down to 1000, the exception still occurred with ES and the indexer running locally. After updating to 0.19.1 bulk indexing is working flawlessly in batches of 5000.
Is anyone among the participants of this issue still having these kinds of problems with a current version, and able to provide more verbose information/logfiles we could use to track this down?
After upgrading ES from 19.x to 20.x, I no longer have the issue.
Closing due to feedback. Happy to reopen if this still happens with the current release! Thanks for the feedback.
Currently happening with ES server 0.90 and client API 0.90.7. I'll try increasing the log levels, and I might try turning on sniff mode. From what I've googled, a few people think it can be a version mismatch; is that possible?
I am now getting this in version 1.1.1, after about an hour of bulk indexing 7 million docs in batches of 1000 into a 3-node cluster with one shard replicated to the other two nodes.
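For reference, the kind of batching described above (millions of docs in batches of 1000) amounts to slicing the document stream into fixed-size chunks and issuing one bulk request per chunk. A minimal sketch, with the actual prepareBulk/execute calls only hinted at in comments:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: cut a document list into fixed-size batches. Each batch would
// become one bulk request in the real indexer, roughly
// client.prepareBulk() ... .execute().actionGet().
public class Batcher {
    static <T> List<List<T>> batches(List<T> docs, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            // subList is a view; copy it if the source list will be mutated
            out.add(docs.subList(i, Math.min(i + batchSize, docs.size())));
        }
        return out;
    }
}
```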
We are also getting similar errors (version 1.0.1; we are using a single-node Elasticsearch). We are trying to index events from Flume using ElasticsearchSink. Events are indexed successfully for some time, and then it starts throwing exceptions; please find the exception stack trace below. Unable to deliver event. Exception follows. org.elasticsearch.client.transport.NoNodeAvailableException: No node available
Some of these issues might be caused by a bug in the transport client retry mechanism, see #6829. What happens there is that when some nodes drop, the transport client doesn't necessarily retry the request with all of the connected nodes.
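To illustrate what that retry mechanism should do: a robust client walks the whole list of connected nodes before giving up, roughly as in the sketch below. The node and send types here are stand-ins, not the real ES classes; the bug described above was, loosely, that the retry did not cover every connected node.

```java
import java.util.List;
import java.util.function.Predicate;

// Sketch of failover node selection: try each connected node in turn and
// fail only when every node has been tried.
public class NodeRetry {
    static <N> N firstReachable(List<N> connectedNodes, Predicate<N> trySend) {
        for (N node : connectedNodes) {
            if (trySend.test(node)) {   // stand-in for sending the request to this node
                return node;            // success: the request was handled by this node
            }
        }
        // Every connected node failed; the real client throws NoNodeAvailableException here.
        throw new RuntimeException("No node available");
    }
}
```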
Please join us in #elasticsearch on Freenode or at https://discuss.elastic.co/ for troubleshooting help; we reserve GitHub for confirmed bugs and feature requests :)
Hi, has anybody had this problem? Or could someone give some ideas on how to fix it? Code: CsvToJson jsonDocument = new CsvToJson();
Error: 2015-06-19 12:55:20 INFO plugins:104 - [Jones, Gabe] loaded [], sites []
While indexing through a Java program, I am getting org.elasticsearch.client.transport.NoNodeAvailableException: No node available. Can anybody tell me a solution? Mail me if you have any suggestion related to this problem.
Just read this comment and don't hijack old threads: #1868 (comment) |
Getting the following trace when trying to close bulkProcessor. 2016-07-05T12:13:30,555 [-] [elasticsearch[EsClient][generic][T#1]] [] [] WARN transport [EsClient] failed to execute failure callback on [org.elasticsearch.action.bulk.Retry$AsyncRetryHandler@3b03ff0], failure [NoNodeAvailableException[None of the configured nodes were available: [{Pietro Maximoff}{kUf52ZmwTXSuY-UBmwLF6w}{127.0.0.1}{localhost/127.0.0.1:9300}]]; nested: NodeDisconnectedException[[Pietro Maximoff][localhost/127.0.0.1:9300][indices:data/write/bulk] disconnected];]
Fixed after enabling sniff mode. I tried moving back to regular mode to reproduce the problem, but could not.
Hi,
I continuously receive the following exceptions after my bulk indexer runs for approx. 2 hours.
I'm using a cluster with 4 elasticsearch nodes, and all nodes were always running. One process issues bulk requests with 100 index requests each, at a throughput of about 1000~2000 docs per second.
The elasticsearch server log files say nothing.
org.elasticsearch.client.transport.NoNodeAvailableException: No node available
at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:214)
at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:220)
at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:220)
at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:220)
at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:182)
at org.elasticsearch.client.transport.support.InternalTransportClient.execute(InternalTransportClient.java:97)
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:141)
at org.elasticsearch.client.transport.TransportClient.bulk(TransportClient.java:295)
at org.elasticsearch.action.bulk.BulkRequestBuilder.doExecute(BulkRequestBuilder.java:128)
at org.elasticsearch.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:53)
at org.elasticsearch.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:47)