globPattern does not work with hierarchical namespace while using azure-blob trigger #6492

Open
bene-tleilax-werdna opened this issue Jan 21, 2025 · 2 comments

Comments

@bene-tleilax-werdna

When attempting to use the azure-blob trigger on an Azure Data Lake Storage account leveraging a hierarchical namespace, it seems the globPattern parameter does not work.

For example, say your storage account is adlsaccount, the container is container, and you have a directory structure such as:

folderA/
├─ subFolderA/
│  ├─ part-fileA.json
│  ├─ part-fileB.json
│  ├─ _METADATA_FILE
├─ subFolderB/
│  ├─ fileD.json
folderB/
├─ subFolderC/
│  ├─ fileE.txt

I want the scaler to trigger only if it detects a *.json file in folderA/subFolderA.

According to the documentation, I should be able to use something like this:

triggers:
- type: azure-blob
  metadata:
    blobContainerName: container
    activationBlobCount: "1"
    blobCount: "5"
    connectionFromEnv: CONNECTION_STRING_ENV_NAME
    accountName: adlsaccount
    globPattern: "/folderA/subFolderA/*.json"

However, that does not work, and the keda-operator logs do not appear to indicate why: the scaler reports as healthy but never activates. Using a double asterisk makes no difference either.

We were able to work around this limitation by using the common file prefix of the desired json files, as demonstrated below:

triggers:
- type: azure-blob
  metadata:
    blobContainerName: container
    activationBlobCount: "1"
    blobCount: "5"
    connectionFromEnv: CONNECTION_STRING_ENV_NAME
    accountName: adlsaccount
    blobPrefix: folderA/subFolderA/part-
    blobDelimiter: "/"
    recursive: "true"

That work-around is fine for my use case, but ideally we would use glob patterns to match the blob files that should invoke the scaler trigger.
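To illustrate why the prefix work-around picks up only the part- files, here is a minimal Go sketch. It assumes a prefix filter is a plain string-prefix test against the full blob name. `filterByPrefix` and the hard-coded name list are hypothetical, for illustration only; this is not KEDA or Azure SDK code.

```go
package main

import (
	"fmt"
	"strings"
)

// filterByPrefix returns the blob names that start with prefix.
// Hypothetical helper: it only mimics a prefix filter on flat blob names.
func filterByPrefix(names []string, prefix string) []string {
	var out []string
	for _, n := range names {
		if strings.HasPrefix(n, prefix) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Flat blob names corresponding to the directory tree in this issue.
	names := []string{
		"folderA/subFolderA/part-fileA.json",
		"folderA/subFolderA/part-fileB.json",
		"folderA/subFolderA/_METADATA_FILE",
		"folderA/subFolderB/fileD.json",
		"folderB/subFolderC/fileE.txt",
	}
	// Same prefix as the blobPrefix value in the work-around config.
	for _, n := range filterByPrefix(names, "folderA/subFolderA/part-") {
		fmt.Println(n)
	}
}
```

With the example tree, only the two part- files pass the filter. That is why the work-around depends on the files sharing a naming convention rather than an extension.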

It looks like the glob pattern is not being applied in the GetAzureBlobListLength function defined here: https://github.com/kedacore/keda/blob/6340991f9178912d6daf73766a0e50e6b438b212/pkg/scalers/azure/azure_blob.go
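For reference, here is a rough Go sketch of the behavior I would expect from glob filtering over a flat blob listing. It uses the standard library's `path.Match` as a stand-in, which is an assumption: KEDA's scaler may use a different glob library with different `*` semantics. `countMatching` is a hypothetical helper, not the actual GetAzureBlobListLength code.

```go
package main

import (
	"fmt"
	"path"
)

// countMatching counts the names matched by pattern using path.Match,
// where "*" does not cross "/" boundaries.
// Hypothetical helper, not part of KEDA.
func countMatching(names []string, pattern string) (int, error) {
	count := 0
	for _, n := range names {
		ok, err := path.Match(pattern, n)
		if err != nil {
			return 0, err // malformed pattern
		}
		if ok {
			count++
		}
	}
	return count, nil
}

func main() {
	// Flat blob names corresponding to the directory tree in this issue.
	names := []string{
		"folderA/subFolderA/part-fileA.json",
		"folderA/subFolderA/part-fileB.json",
		"folderA/subFolderA/_METADATA_FILE",
		"folderA/subFolderB/fileD.json",
		"folderB/subFolderC/fileE.txt",
	}
	patterns := []string{
		"folderA/subFolderA/*.json",  // no leading slash
		"/folderA/subFolderA/*.json", // leading slash, as in the original config
	}
	for _, p := range patterns {
		n, err := countMatching(names, p)
		fmt.Println(p, n, err)
	}
}
```

Under these assumptions, the pattern without a leading slash counts two blobs. The leading-slash pattern counts none, because the flat names in this sketch do not begin with "/".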

Someone more knowledgeable than I am may have an idea of how best to resolve this.

@rickbrouwer
Contributor

I haven't dug into it very deeply yet, but would you mind testing this first:

globPattern: "folderA/subFolderA/*.json"

So, without the leading slash.

@bene-tleilax-werdna
Author

Of course. I think I tried that, along with several other variations, when I was originally trying to figure out the issue.

I just re-tested as requested to confirm, and unfortunately it results in the same behavior as initially described.

Labels
None yet
Projects
Status: To Triage
Development

No branches or pull requests

2 participants