Cannot extend an already-expired lock crashes consumer pods. #283

Open
doramar97 opened this issue Jul 20, 2023 · 2 comments

Comments

@doramar97

We run a single-node ElastiCache for Redis instance in our production environment.
Engine version: 6.2.6
Node type: cache.t3.micro

The environment runs on EKS. Consumers are pods in the cluster, and each pod handles one task at a time. (We also use Amazon MQ with RabbitMQ for handling tasks.)

We use redlock to take a lock per customer: as long as a task for a given customer is executing, no other task for that customer can run, and any such task goes to another queue that handles delayed messages.

Our issue is with long-running tasks, or with multiple tasks addressing the same customer. We get the following errors, which cause the pod to restart:

[redlock] error while executing lock block function. Cannot extend an already-expired lock. 
[redlock] error while executing lock block function.
redlock = new Redlock([redisClient], {
    // The expected clock drift; for more details see:
    // http://redis.io/topics/distlock
    driftFactor: 0.01, // multiplied by lock ttl to determine drift time

    // The max number of times Redlock will attempt to lock a resource
    // before erroring.
    retryCount: 10,

    // the time in ms between attempts
    retryDelay: 500, // time in ms

    // the max time in ms randomly added to retries
    // to improve performance under high contention
    // see https://www.awsarchitectureblog.com/2015/03/backoff.html
    retryJitter: 200, // time in ms

    // The minimum remaining time on a lock before an extension is automatically
    // attempted with the `using` API.
    automaticExtensionThreshold: 500, // time in ms
  });

I'm happy to provide more context or code. We also set lockDuration: number = 2000 in the function that checks whether a block is locked.
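
For anyone reading along, here is a rough sketch of the timing these settings imply. It is not taken from the issue and does not call redlock. It assumes the library subtracts a drift of round(driftFactor * duration) + 2 ms from the lock's validity and that the `using` API schedules its extension `automaticExtensionThreshold` ms before expiry. The names `extensionTiming` and `expiresBeforeExtension` are made up for illustration.

```typescript
// Hypothetical helper: models when an automatic extension would be attempted.
// Assumptions (not confirmed in this thread): drift = round(driftFactor * duration) + 2,
// and the extension timer is set `thresholdMs` before the lock expires.
interface ExtensionTiming {
  driftMs: number;       // time subtracted from the TTL to allow for clock drift
  validForMs: number;    // how long the lock is treated as valid after acquisition
  extendAtMs: number;    // when the extension attempt is scheduled, relative to acquisition
  stallBudgetMs: number; // how late the timer may fire before the lock has expired
}

function extensionTiming(
  durationMs: number,
  driftFactor: number,
  thresholdMs: number
): ExtensionTiming {
  const driftMs = Math.round(driftFactor * durationMs) + 2;
  const validForMs = durationMs - driftMs;
  const extendAtMs = Math.max(0, validForMs - thresholdMs);
  return { driftMs, validForMs, extendAtMs, stallBudgetMs: validForMs - extendAtMs };
}

// True when a delay of `stallMs` (blocked event loop, slow Redis round trip)
// pushes the extension attempt to or past the point where the lock expires.
function expiresBeforeExtension(timing: ExtensionTiming, stallMs: number): boolean {
  return stallMs >= timing.stallBudgetMs;
}

// Values from this issue: lockDuration 2000, driftFactor 0.01, threshold 500.
const timing = extensionTiming(2000, 0.01, 500);
console.log(timing);
console.log(expiresBeforeExtension(timing, 600));
```

Under those assumptions the extension has roughly half a second of slack, so a CPU-bound task that blocks the event loop for longer than that, or a slow round trip to a small cache node, could plausibly produce "Cannot extend an already-expired lock". A longer lock duration or a larger threshold would widen that margin.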

Any help or guidance on this issue, or on best practices for our use case, would be much appreciated.
Thanks!

@raimoa1

raimoa1 commented Feb 7, 2024

We found that the while loop at https://github.com/mike-marcacci/node-redlock/blob/main/src/index.ts#L436 becomes an infinite loop and crashes the server. We also found a memory leak. Interestingly, it happens when there are parallel requests in flight. Somehow redlock can't cope with that and gets stuck in the infinite loop.

@bwright2810

This issue appears to be affecting us as well, in our AWS Lambda executions. We had an API request that took longer than expected during peak times, and as a result it outlasted the configured redlock lock duration. When we went to extend the lock after the API request, the Lambda mysteriously imploded with an "UnknownApplicationError" that was not caught by our error handling block. It looks like this issue is what was happening.
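
One possible workaround for the manual-extend case described above, sketched under assumptions rather than offered as a confirmed fix: check the lock's remaining lifetime before extending, and turn a failed extension into a return value instead of an exception. The `ExtendableLock` interface is a stand-in that assumes the lock object exposes an `expiration` timestamp and an `extend()` method; `tryExtend` and `safetyMarginMs` are hypothetical names.

```typescript
// Minimal shape this sketch relies on. Assumed to mirror the lock object
// returned by the library; adjust if your version differs.
interface ExtendableLock {
  expiration: number; // epoch milliseconds at which the lock stops being valid
  extend(durationMs: number): Promise<ExtendableLock>;
}

type ExtendResult =
  | { ok: true; lock: ExtendableLock }
  | { ok: false; reason: string };

// Attempts to extend a lock without ever throwing.
// Skips the call entirely when the lock has expired or is about to.
async function tryExtend(
  lock: ExtendableLock,
  durationMs: number,
  safetyMarginMs: number = 100,
  now: () => number = Date.now
): Promise<ExtendResult> {
  if (lock.expiration - now() <= safetyMarginMs) {
    return { ok: false, reason: "lock expired or too close to expiry" };
  }
  try {
    return { ok: true, lock: await lock.extend(durationMs) };
  } catch (error) {
    // Report the failure to the caller instead of letting it escape.
    return { ok: false, reason: error instanceof Error ? error.message : String(error) };
  }
}
```

The caller can then decide what a lost lock means for the task (abort, requeue, or re-acquire) rather than having the process die. This only covers an explicit extend call; if the failure comes from the automatic extension inside the `using` API, it would have to be handled wherever that API reports it.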
