Various cleanups for Elasticsearch test #51

dakrone · 2015-04-07T23:59:14Z

This PR makes four changes:

Adding a logging configuration

To determine the root cause of issues, more verbose logging is needed. This change makes Elasticsearch log more verbosely by default and adds the ability to change the log level from Jepsen in the future (instead of by hand).
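
For illustration, a logging configuration along these lines would match the levels named in the commit message for this change (DEBUG by default, TRACE for the gateway and discovery packages). This is only a sketch that assumes the `logging.yml` layout shipped with Elasticsearch 1.x; the appender section is omitted, and the file actually added by this PR may differ.

```yaml
# Sketch only: assumes the Elasticsearch 1.x logging.yml layout.
# Default level for every logger.
es.logger.level: DEBUG
rootLogger: ${es.logger.level}, console, file

logger:
  # More detail for the packages most relevant to cluster recovery
  # and node membership.
  gateway: TRACE
  discovery: TRACE
```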

Moving nuke! to before tests

Without this, Jepsen deletes all traces of itself after running, which makes debugging much more difficult (no logs and no data left).

Use more reasonable settings for scroll

An optional change, but I figured I would make this change anyway.

Wait for index to become green after creation

This is a key part of testing Elasticsearch, and clients should always do this when creating indices.
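
As a rough sketch of what "wait for green" means on the client side, the loop below polls a status function until it reports `"green"` or a deadline passes. The names `await-green` and `status-fn` are hypothetical and are not taken from this PR; `status-fn` stands in for whatever call fetches the index health. Elasticsearch's cluster health API also accepts a `wait_for_status=green` parameter that blocks on the server side, which may be closer to what the test actually does.

```clojure
(defn await-green
  "Polls status-fn (a hypothetical zero-argument function returning the
  index health as a string) until it returns \"green\" or timeout-ms
  elapses. Returns true if green was observed, false on timeout."
  [status-fn timeout-ms interval-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (cond
        ;; Healthy: all primary and replica shards are allocated.
        (= "green" (status-fn)) true
        ;; Give up once the deadline has passed.
        (>= (System/currentTimeMillis) deadline) false
        ;; Otherwise wait a little and ask again.
        :else (do (Thread/sleep (long interval-ms))
                  (recur))))))

(comment
  ;; A stub that reports yellow twice, then green.
  (def calls (atom 0))
  (defn fake-status [] (if (< (swap! calls inc) 3) "yellow" "green"))
  (await-green fake-status 1000 10)) ; => true
```

The test should only start issuing operations once this returns true; a false return means the index never finished allocating its shards within the timeout.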

Note that I was able to reproduce the failures in elastic/elasticsearch#10426 (about half of the time) without these changes. However, after the change that waits for green after index creation, I can no longer reproduce data loss with the create-pause test (I am still evaluating the other tests).

dakrone · 2015-04-08T00:00:33Z

@aphyr also, how would you feel about me replacing Elastisch with vanilla clj-http? Elastisch uses clj-http internally anyway and it would reduce the number of moving parts in this test. I'm happy to submit another PR if you are interested.

aphyr · 2015-04-15T17:31:17Z

elasticsearch/src/elasticsearch/core.clj

-              (throw (RuntimeException. err))))))
-
+                    :settings {"index" {"refresh_interval" "1s"}})
+        (catch Throwable t))


I'd really prefer to know about connection errors etc that happen here; the only reason it's appropriate to noop is if the index already exists.

dakrone · 2015-04-16

Sure, I will change this to handle only IndexAlreadyExistsException, as you did previously. I do think using basic clj-http would be easier, as it'd allow you to use:

(try+ ... (catch [:status 400] ;; ignore

Instead of catching any throwable during index creation, catch a specific exception and ensure it was only because the index already existed. Additionally, this changes the hardcoded node count of 5 to `(count (:nodes test))` so a dynamic number of nodes can be used.
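
The pattern described here (ignore only the "already exists" failure and let everything else propagate) could look roughly like the sketch below. It is not the code from this PR: `create-index-idempotently!`, `create-fn`, and the `:type :index-already-exists` marker are hypothetical stand-ins so that the example runs without an Elasticsearch client. The real test would catch Elasticsearch's IndexAlreadyExistsException instead.

```clojure
(defn create-index-idempotently!
  "Calls create-fn (a hypothetical zero-argument function that creates
  the index). Returns :created on success and :already-exists if the
  index was already there. Any other failure is rethrown."
  [create-fn]
  (try
    (create-fn)
    :created
    (catch clojure.lang.ExceptionInfo e
      (if (= :index-already-exists (:type (ex-data e)))
        ;; The only failure that is safe to ignore.
        :already-exists
        ;; Connection errors and the like must stay visible.
        (throw e)))))

(comment
  (create-index-idempotently!
    #(throw (ex-info "exists" {:type :index-already-exists})))
  ;; => :already-exists
  )
```

Keeping the catch this narrow addresses the review comment above: a refused connection still fails the setup step instead of being swallowed silently.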

dakrone · 2015-04-17T19:39:17Z

Pushed another commit addressing your feedback, thanks for taking a look!

aphyr · 2015-04-17T19:40:52Z

Excellent, thanks @dakrone :)

aphyr · 2015-04-28T01:52:14Z

Do these tests run for you? Elasticsearch doesn't even start on my nodes anymore; it times out waiting for cluster recovery.

dakrone · 2015-04-28T02:56:53Z

@aphyr I just double-checked this and the tests are still running for me.

dakrone added 4 commits April 7, 2015 15:26

Add logging configuration

aa66375

This increases the default logging to DEBUG, and sets TRACE logging for gateway and discovery packages.

Move nuke! to before the test starts

efc15e4

This allows someone to collect the ES logs and data *after* a test run. Otherwise the logs and data are removed and ES is stopped, making further debugging impossible.

Wait for index to become green after creation

22a50b2

After an index is created, clients should always wait for the index to be fully created (the request returns immediately) before starting the test.

Use more reasonable settings for scroll

b1cb921

There is no need for the `query_then_fetch` setting; use ten seconds instead of one minute, and a more reasonable size of 20 rather than 2.

bleskes mentioned this pull request Apr 10, 2015

A VM pause (due to GC, high IO load, etc) can cause the loss of inserted documents elastic/elasticsearch#10426

Closed

aphyr reviewed Apr 15, 2015

aphyr added a commit that referenced this pull request Apr 17, 2015

Merge pull request #51 from dakrone/es-cleanups

8bbd973

Various cleanups for Elasticsearch test

aphyr merged commit 8bbd973 into jepsen-io:master Apr 17, 2015

aphyr added a commit that referenced this pull request Aug 23, 2016

Merge pull request #51 from dakrone/es-cleanups

159a894

Various cleanups for Elasticsearch test

def- pushed a commit to def-/jepsen that referenced this pull request May 24, 2023

run-jepsen: Add iterations option (jepsen-io#51)

0433e32
