
more detail on storage class #31420

Closed
wants to merge 2,901 commits into from

Conversation

geekpete
Member

Additional detail on how storage class setting affects snapshot files.

@colings86 colings86 added >docs General docs changes :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Jun 19, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@geekpete
Member Author

geekpete commented Jun 19, 2018

I only wonder if it's also good to recommend keeping all files in a repository on a single storage class: set it via the storage class setting initially, and via the AWS API plus the storage class setting if changing storage class at a later point. That seems better than mixing and matching storage classes between different snapshots in the same repository, since there's not much value in a mixed approach for perceived lifecycling patterns (keep all monthlies as standard class, subsequent inter-month snaps as reduced redundancy/IA, etc.) due to the incremental nature of snapshots: snapshot files rely on other files in the repository without much external visibility. So perhaps provide a general best-practice tip on using the storage class setting, namely to keep all files in a repo on the same class.

@bleskes
Contributor

bleskes commented Jun 20, 2018

@tlrx any opinions here?

@tlrx
Member

tlrx commented Jun 20, 2018

This change looks good. Maybe we could add a similar clarification for `canned_acl`?

I agree that mixing storage classes within the same repository does not add a lot of value. I wish this setting was not exposed at all and that we recommended using S3 Lifecycle Policies to set the storage class. Maybe we can write a word about this, and emphasise that we recommend using the same storage class for all files in the same repo?
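To make the Lifecycle Policies suggestion concrete, a bucket-level rule along these lines could handle the storage class transition on the S3 side. This is only a sketch: the rule ID, empty prefix, target class, and 30-day threshold are made-up illustrative values, not a recommendation.

```json
{
  "Rules": [
    {
      "ID": "transition-snapshot-objects-to-ia",
      "Filter": { "Prefix": "" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" }
      ]
    }
  ]
}
```

Such a configuration would be applied to the repository's bucket (for example via `aws s3api put-bucket-lifecycle-configuration`), independently of any Elasticsearch setting.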

@geekpete
Member Author

I added a bit about recommended usage of storage_class.

Contributor

@DaveCTurner DaveCTurner left a comment


I'm not an enormous fan of this change (in isolation) for a handful of reasons:

  • we do not describe how we write objects to the repository, so after adding this I think users will still struggle to understand the implications of the various options for storage_class. I'm not sure if we want to describe the repository layout in enough detail to address this.

  • I do not think we should explicitly recommend setting all objects to the same storage class. It makes the most sense to me, but the choice of storage class is irrelevant to Elasticsearch, and I think users should feel free to set it to whatever they want according to their usage.

  • we should really call them "objects" rather than "files" in the context of S3.

@DaveCTurner
Contributor

Incidentally I just realised that lifecycle policies totally make sense on time-based indices, in contradiction to what I said in Slack.

@geekpete
Member Author

So these are all good points.
Do we need to just specify that this setting only affects newly created objects and leave it at that?

@tlrx
Member

tlrx commented Jun 26, 2018

we do not describe how we write objects to the repository, so after adding this I think users will still struggle to understand the implications of the various options for storage_class. I'm not sure if we want to describe the repository layout in enough detail to address this.

I don't think that we should describe how we write objects to the repository, and I don't think it would help anyway. But I think it is valuable for users to understand the implication of changing the storage class on an existing repository, i.e. that the storage class is applied to newly snapshotted files only. The explanation of incremental snapshots looks sufficient to me.

I do not think we should explicitly recommend setting all objects to the same storage class. It makes the most sense to me, but the choice of storage class is irrelevant to Elasticsearch, and I think users should feel free to set it to whatever they want according to their usage.

I agree that the choice of storage class is out of scope, and I don't think this change recommends using a specific storage class? I still think that we must indicate a) what the default storage class is, b) that changing the setting on an existing repository only changes the storage class for newly created objects, resulting in a mixed usage of storage classes in the repo, and c) that users may prefer to use Lifecycle Policies to manage storage classes.

@DaveCTurner
Contributor

I don't think this change recommends using a specific storage class?

That's right, but it does recommend using the same storage class for all objects in the repo, and updating the storage class of existing objects via the API when changing the repo's storage class setting. I don't think we should do that.

I still think that we must indicate a) what the default storage class is, b) that changing the setting on an existing repository only changes the storage class for newly created objects, resulting in a mixed usage of storage classes in the repo, and c) that users may prefer to use Lifecycle Policies to manage storage classes.

+1
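For reference, the "updating the storage class of existing objects via the API" discussed above amounts to a copy-in-place of each object. With the AWS CLI it would look something like the following sketch; the bucket and object key here are hypothetical, and the Elasticsearch docs don't prescribe this.

```sh
# Rewrite an existing object in place with a new storage class.
# Bucket and key are made-up examples, not real repository paths.
aws s3 cp s3://my-snapshot-bucket/indices/0/__example \
          s3://my-snapshot-bucket/indices/0/__example \
          --storage-class STANDARD_IA
```

Doing this per object across a large repository is exactly the kind of churn that makes a bucket-wide Lifecycle Policy the simpler option.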

@tlrx
Member

tlrx commented Jun 26, 2018

That's right, but it does recommend using the same storage class for all objects in the repo, and updating the storage class of existing objects via the API when changing the repo's storage class setting. I don't think we should do that.

Oh I see. Yes, I agree on that point.

@geekpete
Member Author

Made some changes to the wording based on recent comments.

Contributor

@DaveCTurner DaveCTurner left a comment


I suggested some minor tweaks. Also could you wrap the text so the lines are all about the same length?

@@ -190,8 +190,12 @@ The following settings are supported:

`storage_class`::

Sets the S3 storage class type for the backup files. Values may be
Sets the S3 storage class for snapshot repository objects. Values may be

I think I'd prefer "objects stored in the snapshot repository" instead of "snapshot repository objects"?

`standard`, `reduced_redundancy`, `standard_ia`. Defaults to `standard`.
All new objects created will use the set storage class.

I think this sentence can be removed - the subsequent sentence says this and more.

`standard`, `reduced_redundancy`, `standard_ia`. Defaults to `standard`.
All new objects created will use the set storage class.
Changing this setting on an existing repository only affects the storage class for newly created objects,
resulting in a mixed usage of storage classes. Alternatively, S3 Lifecycle Policies can be used

"Alternatively" suggests you can't do both. "Additionally" would be better.

All new objects created will use the set storage class.
Changing this setting on an existing repository only affects the storage class for newly created objects,
resulting in a mixed usage of storage classes. Alternatively, S3 Lifecycle Policies can be used
to manage storage class.

I think it'd be better to say "the storage class of existing objects" instead of just "storage class" here.
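Pulling the reviewed wording together, the setting in question is applied when the repository is registered. A sketch in the docs' usual console format, with placeholder repository and bucket names:

```console
PUT _snapshot/my_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-bucket",
    "storage_class": "standard_ia"
  }
}
```

Objects written by snapshots taken after this registration would use `standard_ia`; per the discussion above, objects already in the repository keep whatever class they were written with.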

@geekpete
Member Author

Further updates made based on suggestions.

Contributor

@DaveCTurner DaveCTurner left a comment


LGTM, thanks @geekpete

Member

@tlrx tlrx left a comment


Thanks!

@DaveCTurner
Contributor

This branched off master some time ago so I wouldn't be surprised if this build fails. If it does, merging in the latest master should fix it.

@geekpete
Member Author

geekpete commented Aug 1, 2018

@elasticmachine retest this please

@DaveCTurner
Contributor

It's unlikely that retesting this will lead to a green build. You need to merge the latest master since we've released versions since this branch was cut, so the list of released versions is out of date, causing this:

03:17:18 Execution failed for task ':distribution:bwc:next-bugfix-snapshot:buildBwcVersion'.
03:17:18 > Building bwc version didn't generate expected files [/var/lib/jenkins/workspace/elastic+elasticsearch+pull-request/distribution/bwc/next-bugfix-snapshot/build/bwc/checkout-6.3/distribution/archives/oss-zip/build/distributions/elasticsearch-oss-6.3.1-SNAPSHOT.zip, /var/lib/jenkins/workspace/elastic+elasticsearch+pull-request/distribution/bwc/next-bugfix-snapshot/build/bwc/checkout-6.3/distribution/archives/zip/build/distributions/elasticsearch-6.3.1-SNAPSHOT.zip, /var/lib/jenkins/workspace/elastic+elasticsearch+pull-request/distribution/bwc/next-bugfix-snapshot/build/bwc/checkout-6.3/distribution/packages/oss-deb/build/distributions/elasticsearch-oss-6.3.1-SNAPSHOT.deb, /var/lib/jenkins/workspace/elastic+elasticsearch+pull-request/distribution/bwc/next-bugfix-snapshot/build/bwc/checkout-6.3/distribution/packages/deb/build/distributions/elasticsearch-6.3.1-SNAPSHOT.deb, /var/lib/jenkins/workspace/elastic+elasticsearch+pull-request/distribution/bwc/next-bugfix-snapshot/build/bwc/checkout-6.3/distribution/packages/oss-rpm/build/distributions/elasticsearch-oss-6.3.1-SNAPSHOT.rpm, /var/lib/jenkins/workspace/elastic+elasticsearch+pull-request/distribution/bwc/next-bugfix-snapshot/build/bwc/checkout-6.3/distribution/packages/rpm/build/distributions/elasticsearch-6.3.1-SNAPSHOT.rpm]

imotov and others added 9 commits October 26, 2018 14:23
It seems that this statement is a debug leftover since it currently
adds an error message `{"jobs":[]}` after each successful REST test.
When dealing with a null group, the associated metric aggs need to
return null as well

Fix #34896
* Removed methods are just unused (the exceptions being isGeoPoint() and isFloatingPoint(), but those could more efficiently be replaced by enum comparisons to simplify the code)
* Removed exceptions aren't thrown
Clarifies the width rules for snippets.
we are restructuring the docs, this migrates ILM docs outside of the x-pack doc structure.
`AbstractComponent` is trouble because its name implies that
*everything* should extend from it. It *is* useful, but maybe too
broadly useful. The things it offers access too, the `Settings` instance
for the entire server and a logger are nice to have around, but not
really needed *everywhere*. The `Settings` instance especially adds a
fair bit of ceremony to testing without any value.

This removes the `nodeName` method from `AbstractComponent` so it is
more clear where we actually need the node name.
Drops the `deprecationLogger` from `AbstractComponent`, moving it to
places where we need it. This saves us from building a bunch of
`DeprecationLogger`s that we don't need.

Relates to #34488
insukcho and others added 22 commits November 7, 2018 17:01
Sometimes users are confused about whether they can use the Convert Processor to change an existing field's type to another type, even if the field has already been ingested. This confusion comes from the first line of the description. Changing this and also adding some detail to the code snippet.
* Moved `AcknowledgedResponse` to core package
* Made AcknowledgedResponse not abstract and provided a default parser,
so that in cases when the field name is not overwritten then there
is no need for a subclass.

Relates to #33824
* Consolidate the name of the qualified build version

* Field name in response should not be redundant
Today we only apply `ignore_throttled` to expansions from wildcards,
date math expressions and aliases. Yet, this is tricky since we might
have resolved certain expressions in pre-filter steps like security.
It's more consistent to apply this logic to all expressions including
concrete indices.

Relates to #34354
* ML: Adding missing data check class

* reverting bad change

* Adding bucket + missing data object for returns

* reverting unnecessary change

* adding license header

* Make client calls synchronous, akin to DatafeedJob

* Fixing line length

* Renaming things, addressing PR comments
Adds basic rolling upgrade tests to check that lifecycles are still recognizable between rolling cluster upgrades and managed indices stay managed.

This is a placeholder for discussing types of checks so they are ready once we backported

This is a re-boot of the previous PR against index-lifecycle that needed to be
reverted due to CI bwc issues. #32828
An auto follow pattern:
* cannot start with `_`
* cannot contain a `,`
* can be encoded in UTF-8
* the length of UTF-8 encoded bytes is no longer than 255 bytes
removes fake allocation id after recovery is done

Relates to #33432
This change adds a `frozen` engine that allows lazily opening a directory reader
on a read-only shard. The engine wraps general-purpose searchers in a LazyDirectoryReader
that also allows releasing and resetting the underlying index readers after any and before
secondary search phases.

Relates to #34352
…query (#35160)

Today when a percolator query contains a date range then the query
analyzer extracts that range, so that at search time the `percolate` query
can exclude percolator queries efficiently that are never going to match.

The problem is that if 'now' is used, it is evaluated at index time.
So the idea is to rewrite date ranges with 'now' to a match-all query,
so that the query analyzer can't extract it and the `percolate` query
is then able to evaluate 'now' at query time.
* [ILM] Check shard and relocation status in AllocationRoutedStep

This is a follow-up from #35161 where we now check for started and relocating
state in `AllocationRoutedStep`.

Resolves #35258
A CCR test failure shows that the approach in #34474 is flawed.
Restoring the LocalCheckpointTracker from an index commit can cause both
FollowingEngine and InternalEngine to incorrectly ignore some deletes.

Here is a small scenario illustrating the problem:

1. Delete doc with seq=1 => engine will add a delete tombstone to Lucene

2. Flush a commit consisting of only the delete tombstone

3. Index doc with seq=0  => engine will add that doc to Lucene but soft-deleted

4. Restart an engine with the commit (step 2); the engine will fill its
LocalCheckpointTracker with the delete tombstone in the commit

5. Replay the local translog in reverse order: index#0 then delete#1

6. When processing index#0, the engine will add it into Lucene as a live doc
and advance the local checkpoint to 1 (seq#1 was restored from the
commit - step 4).

7. When processing delete#1, the engine will skip it because seq_no=1 is
less than or equal to the local checkpoint.

We should have zero documents after recovering from translog, but here we
have one.

Since all operations after the local checkpoint of the safe commit are
retained, we should find them if the look-up considers also soft-deleted
documents. This PR fills the disparity between the version map and the
local checkpoint tracker by taking soft-deleted documents into account
while resolving strategy for engine operations.

Relates #34474
Relates #33656
Since it's still possible to shrink an index when replicas are unassigned, we
should not check that all copies are available when performing the shrink, since
we set the allocation requirement for a single node.

Resolves #35321
#35271)

Ensure that Watcher is correctly started and stopped between tests for
SmokeTestWatcherWithSecurityIT,
SmokeTestWatcherWithSecurityClientYamlTestSuiteIT,
SmokeTestWatcherTestSuiteIT, WatcherRestIT,
XDocsClientYamlTestSuiteIT, and XPackRestIT

The change here is to throw an `AssertionError` instead of `break;` to
allow the `assertBusy()` to continue to busy wait until the desired
state is reached.

closes #33291, closes #29877, closes #34462, closes #30705, closes #34448
The lookup vars under params (namely _fields and _source) were
inadvertently removed when scoring scripts were converted to using
script contexts. This commit adds them back, along with deprecation
warnings for those that should not be used.
Additional detail on how storage class setting affects snapshot files.
@cbuescher
Member

@geekpete @dct something went wrong with the master merge it seems. Is there any way to recover and maybe merge this?

@geekpete
Member Author

geekpete commented Nov 8, 2018

At this point I'd almost just do a new PR against recent master, close this one, and reference it from the new one, unless there's some force rebase/fix I can do to get this cleaned up...

@cbuescher
Member

At this point I'd almost just do a new PR against recent master

+1 to that, if you have the last version of the PR still around.

geekpete added a commit that referenced this pull request Nov 9, 2018
A clean PR from original reference PR with discussion here: #31420
@geekpete
Member Author

geekpete commented Nov 9, 2018

Closed in favour of #35400

@geekpete geekpete closed this Nov 9, 2018
@geekpete geekpete deleted the geekpete-patch-5 branch November 30, 2018 05:16