Introduce StorageConnector for Azure #14660
Conversation
extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureStorage.java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

public abstract class ChunkingStorageConnector<T> implements StorageConnector
Can you please add Javadoc to this class, since it is the crux of this PR?
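A Javadoc sketch along the lines requested might look like the following. This is only an illustration: the `StorageConnector` interface is stubbed out as empty here so the snippet compiles standalone, and the wording of the doc comment is an assumption about the class's role, not the PR's actual Javadoc.

```java
// Empty stub standing in for Druid's real StorageConnector interface,
// so this sketch is self-contained.
interface StorageConnector
{
}

/**
 * Base class for {@link StorageConnector} implementations that download
 * large objects in bounded chunks rather than in a single ranged read.
 *
 * <p>Subclasses supply the cloud-specific client calls; this class is
 * intended to own the chunking loop shared across connectors.</p>
 *
 * @param <T> the cloud-specific handle used to open each chunk
 */
abstract class ChunkingStorageConnector<T> implements StorageConnector
{
}
```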
...re-extensions/src/test/java/org/apache/druid/storage/azure/output/AzureOutputConfigTest.java
...ure-extensions/src/test/java/org/apache/druid/storage/azure/output/AzureOutputSerdeTest.java
public ChunkingStorageConnectorParameters<T> build()
{
  Preconditions.checkArgument(start >= 0, "'start' not provided or an incorrect value [%s] passed", start);
  Preconditions.checkArgument(end >= 0, "'end' not provided or an incorrect value [%s] passed", end);
Would end < start return a good error message?
Updated a check with this as well in the PR!
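The combined validation discussed here (non-negative bounds plus the suggested `end < start` check) could look like the sketch below. Plain `IllegalArgumentException` throws are used instead of Guava's `Preconditions` so the snippet is self-contained; the class name and messages are illustrative, not the PR's actual code.

```java
// Hypothetical sketch of the range validation discussed in the review.
class RangeParams
{
  private final long start;
  private final long end;

  RangeParams(long start, long end)
  {
    if (start < 0) {
      throw new IllegalArgumentException("'start' not provided or an incorrect value [" + start + "] passed");
    }
    if (end < 0) {
      throw new IllegalArgumentException("'end' not provided or an incorrect value [" + end + "] passed");
    }
    // The extra check suggested by the reviewer: a clear message when the
    // range is inverted, instead of a confusing downstream failure.
    if (end < start) {
      throw new IllegalArgumentException("'end' [" + end + "] cannot be smaller than 'start' [" + start + "]");
    }
    this.start = start;
    this.end = end;
  }

  long size()
  {
    return end - start;
  }
}
```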
{
  private static final long DOWNLOAD_MAX_CHUNK_SIZE_BYTES = 100_000_000;

  public ChunkingStorageConnector()
Does this need to be public?
Reverted the change so that the individual connectors can control the chunk sizes. Used primarily for testing for now, though this can be extended to the real implementations as well.
Looks good to me overall
    params.getMaxRetry()
  ),
  outFile,
  new byte[8 * 1024],
I know this code was only moved, but could you add a comment on why these numbers are chosen?
Sure
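The comment the reviewer asked for might document the buffer size roughly as below. This is a self-contained sketch, not the PR's code: the class and constant names are invented for illustration, and the rationale in the comment is a common justification for an 8 KiB copy buffer, stated as an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative sketch of a documented copy buffer, as requested in review.
class BufferedCopy
{
  // 8 KiB is a conventional I/O copy-buffer size (it is also the default
  // buffer size of java.io.BufferedInputStream): large enough to amortize
  // per-call overhead, small enough that the per-download memory footprint
  // stays negligible.
  static final int COPY_BUFFER_SIZE = 8 * 1024;

  static long copy(InputStream in, OutputStream out) throws IOException
  {
    byte[] buffer = new byte[COPY_BUFFER_SIZE];
    long total = 0;
    int read;
    while ((read = in.read(buffer)) != -1) {
      out.write(buffer, 0, read);
      total += read;
    }
    return total;
  }
}
```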
Changes LGTM. The user-facing docs are still pending.
.../azure-extensions/src/main/java/org/apache/druid/storage/azure/output/AzureOutputConfig.java
Thanks, @adarshsanjeev and @cryptoe, for the reviews, and @dhananjay1308 for testing the changes out on a cluster.
Description
This PR adds a storage connector for interacting with Azure's blob storage, using the Azure API already present in Druid. This allows durable storage and MSQ's interactive APIs to work with Azure.
It also refactors the existing S3 connector so that the chunked-download logic currently implemented there can be extended to other connectors. (Note: this refactoring is ported from PR #14611, since that work is currently parked.)
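The chunked-download idea described above can be sketched as follows: a large byte range is split into fixed-size chunks so that each network call stays bounded and individually retryable. The class name and helper are illustrative; only the 100 MB cap mirrors the `DOWNLOAD_MAX_CHUNK_SIZE_BYTES` constant visible in the diff.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of planning chunked downloads over a byte range.
class ChunkPlanner
{
  // Mirrors the 100 MB cap seen in the diff; illustrative here.
  static final long MAX_CHUNK_SIZE_BYTES = 100_000_000L;

  /** Returns {offset, length} pairs covering the half-open range [start, end). */
  static List<long[]> planChunks(long start, long end)
  {
    List<long[]> chunks = new ArrayList<>();
    for (long offset = start; offset < end; offset += MAX_CHUNK_SIZE_BYTES) {
      long length = Math.min(MAX_CHUNK_SIZE_BYTES, end - offset);
      chunks.add(new long[]{offset, length});
    }
    return chunks;
  }
}
```

Each planned chunk can then be fetched with a ranged read against the cloud store, which is what lets the same loop serve both the S3 and Azure connectors.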
Testing plan
Release note
An Azure connector has been introduced; MSQ's fault tolerance and durable storage can now be used with Microsoft Azure's blob storage. In addition, the newly introduced queries from deep storage can now store and fetch their results from Azure's blob storage.
Key changed/added classes in this PR
This PR has: