-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Snapshot : azure module - accelerate the listing of files (used in delete snapshot) #25710
Snapshot : azure module - accelerate the listing of files (used in delete snapshot) #25710
Conversation
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
@karmi , My company signed a CLA ("A company wanting its employees to contribute") with my Github account in the annex ". Do I have to sign an other one (" An individual contributing under an existing company contributor agreement") ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I left some comments. In general it looks like the idea is to call a different listBlobs impl which can return all the metadata of the keys along with the key itself. Can you please update the description to explain that?
CloudBlobClient client = this.getSelectedClient(account, mode); | ||
CloudBlobContainer blobContainer = client.getContainerReference(container); | ||
|
||
SocketAccess.doPrivilegedVoidException(() -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doPrivileged needs to stay.
// uri.getPath is of the form /container/keyPath.* and we want to strip off the /container/ | ||
// this requires 1 + container.length() + 1, with each 1 corresponding to one of the / | ||
String blobPath = uri.getPath().substring(1 + container.length() + 1); | ||
if (!(blobItem instanceof CloudBlockBlob)){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this check, it should be an error if we don't get back CloudBlockBlob (which is what we expect for all keys here). I think it can be a hard assert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it is the assumption on the previous version of code but I prefer to be defensive : if the azure storage container has been modified by an outside component, it could have other type of blob (Append/Page) so I propose ignoring it and put a warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am +1 on what @rjernst said - silently ignoring is not the way to go here, very few read log messages, its more important to fail hard if something changed that alters our underlying assumptions of what object type is returned there.
logger.trace("blob url [{}], name [{}], size [{}]", uri, name, properties.getLength()); | ||
blobsBuilder.put(name, new PlainBlobMetaData(name, properties.getLength())); | ||
if (blobContainer.exists()) { | ||
for (ListBlobItem blobItem : blobContainer.listBlobs(keyPath + (prefix == null ? "" : prefix),false,enumBlobListingDetails,null,null)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please use spaces after the commas
Also, check the length of this line, it looks close to the 140 char limit
*/ | ||
String testName = "snapshot-itest-" | ||
.concat(RandomizedTest.getContext().getRunnerSeedAsString().toLowerCase(Locale.ROOT)) | ||
.concat(new Long(RandomizedTest.getContext().getRandom().nextLong()).toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can simply be randomLong()
. However, I don't think this will work because the cleanup in wipeAzureRepositories()
would then not know about the generated name. If these need to be unique, then this method needs to stash the generated name so that the repository can be cleaned up after the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right : I rollback as it will not work in wipeAzureRepository .
return testName.contains(" ") ? Strings.split(testName, " ")[0] : testName; | ||
} | ||
|
||
@Override | ||
public Settings indexSettings() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these should be removed, but if they should, then please explain why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reflexion, ok to keep it but we have to add the MockFSIndexStore.TestPlugin so that the configuration variable are known. If we don't have, we have random error as MockFSIndexStore.TestPlugin is added randomly in ESIntegTestCase superclass
@etiennecarriere Regarding the CLA, I believe you need to associate your github profile with the company signed CLA. If you follow the link to the CLA, I think that is the second option |
Hi @etiennecarriere, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in yout Git commit. Can you please add both of these e-mails into your Github profile (they can be hidden), so we can match your e-mails to your Github profile? |
6ad3fb6
to
e6478a6
Compare
@karmi, my fault, I commited the corrections with my personal account and not my professional account. |
@Etienne-Carriere, the check is now green, thanks for fixing the e-mail problem! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Etienne-Carriere Thanks for the updates and sorry for the delay in reviewing again. I left a couple more minor comments. If you can address them, I will merge this.
SocketAccess.doPrivilegedVoidException(() -> { | ||
if (blobContainer.exists()) { | ||
for (ListBlobItem blobItem : blobContainer.listBlobs(keyPath + (prefix == null ? "" : prefix))) { | ||
for (ListBlobItem blobItem : blobContainer.listBlobs(keyPath + (prefix==null ? "" : prefix),false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please keep the spacing around ==
and between arguments (after ,
)
private String getRepositoryPath() { | ||
String testName = "it-" + getTestName(); | ||
return testName.contains(" ") ? Strings.split(testName, " ")[0] : testName; | ||
} | ||
|
||
public static String getContainerName() { | ||
String testName = "snapshot-itest-".concat(RandomizedTest.getContext().getRunnerSeedAsString().toLowerCase(Locale.ROOT)); | ||
/* Have a different name per test so that there is no possible race condition. As the long can be negative, | ||
* there mustn't be an hyphen between the 2 concat numbers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: an -> a, concat -> concatenated
String testName = "snapshot-itest-".concat(RandomizedTest.getContext().getRunnerSeedAsString().toLowerCase(Locale.ROOT)); | ||
/* Have a different name per test so that there is no possible race condition. As the long can be negative, | ||
* there mustn't be an hyphen between the 2 concat numbers | ||
* (can't have 2 consecutives hypens on Azure containers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: hypens -> hyphens
Changes made. For the resolution of the conflicts, I don't know what is the Elasticsearch method :
|
@elasticmachine test this please |
Thanks @Etienne-Carriere, I will get this merged once CI completes. |
@Etienne-Carriere This branch no longer compiles against master. Can you bring it up to date? I think the ctor for azure impl has been changed. |
I change the API call . I made the following tests : but the gradle check the root directory failed on an error that seems not to be linked with this PR. |
@elasticmachine please test this |
@rjernst , do you need additionnals elements to validate the PR ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for your patience @Etienne-Carriere
…rflow * origin/master: (59 commits) Fix Lucene version of 5.6.1. Remove azure deprecated settings (elastic#26099) Handle the 5.6.0 release Allow plugins to validate cluster-state on join (elastic#26595) Remove index mapper dynamic settings (elastic#25734) update AWS SDK for ECS Task IAM support in discovery-ec2 (elastic#26479) Azure repository: Accelerate the listing of files (used in delete snapshot) (elastic#25710) Build: Remove norelease from forbidden patterns (elastic#26592) Fix reference to painless inside expression engine (elastic#26528) Build: Move javadoc linking to root build.gradle (elastic#26529) Test: Remove leftover static bwc test case (elastic#26584) Docs: Remove remaining references to file and native scripts (elastic#26580) Snapshot fallback should consider build.snapshot elastic#26496: Set the correct bwc version after backport to 6.x Fix the MapperFieldType.rangeQuery API. (elastic#26552) Deduplicate `_field_names`. (elastic#26550) [Docs] Update method setSource(byte[] source) (elastic#26561) [Docs] Fix typo in javadocs (elastic#26556) Allow multiple digits in Vagrant 2.x minor versions Support Vagrant 2.x ...
* master: (21 commits) Ensure module is bundled before installing in tests Add boolean similarity to built in similarity types (elastic#26613) [Tests] Remove skip tests in search/30_limits.yml Let search phases override max concurrent requests Add a soft limit for the number of requested doc-value fields (elastic#26574) Support for accessing Azure repositories through a proxy (elastic#23518) Add beta tag to MSI Windows Installer (elastic#26616) Fix Lucene version of 5.6.1. Remove azure deprecated settings (elastic#26099) Handle the 5.6.0 release Allow plugins to validate cluster-state on join (elastic#26595) Remove index mapper dynamic settings (elastic#25734) update AWS SDK for ECS Task IAM support in discovery-ec2 (elastic#26479) Azure repository: Accelerate the listing of files (used in delete snapshot) (elastic#25710) Build: Remove norelease from forbidden patterns (elastic#26592) Fix reference to painless inside expression engine (elastic#26528) Build: Move javadoc linking to root build.gradle (elastic#26529) Test: Remove leftover static bwc test case (elastic#26584) Docs: Remove remaining references to file and native scripts (elastic#26580) Snapshot fallback should consider build.snapshot ...
Close #25424 .
Current behaviour of the Listing of a container in azure plugin :
=> it is ineffective as we have N+1 requests for N blobs
Proposed behaviour