Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improvement: capture and retry database concurrent update errors #5227

Conversation

jfcalvo
Copy link
Member

@jfcalvo jfcalvo commented Jul 15, 2024

Description

After investigate timeouts for PostgreSQL I have found that timeouts should not affect errors when a SERIALIZABLE transactions is rollbacked because another concurrent update error is raised.

So the only way to support concurrent updates with PostgreSQL and SERIALIZABLE transactions is to capture errors and retry the transaction.

This code has the following changes:

  • Start using backoff library to retry any of the possible CRUD context functions updating responses and record statuses, using SERIALIZABLE database sessions.
  • This change has the side effect of working with PostgreSQL and SQLite at the same time.
  • I have set a fixed time of 15 seconds as maximum time for retrying with exponential backoff.
  • I have moved search engine updates outside of the transaction block.
  • This should mitigate errors on high concurrency scenarios for PostgreSQL and SQLite:
    • For SQLite we have the additional setting to set a timeout if necessary.
  • I have changed DEFAULT_DATABASE_SQLITE_TIMEOUT value to 5 seconds so the backoff logic will handle possible problems with locked database errors and SQLite.

Refs #5000

Type of change

  • Improvement (change adding some improvement to an existing functionality)

How Has This Been Tested

  • Manually testing with PostgreSQL and SQLite, running benchmarks using 20 concurrent requests.
  • Running test suite for PostgreSQL and SQLite.

Checklist

  • I added relevant documentation
  • I followed the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • I confirm My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@jfcalvo jfcalvo requested a review from frascuchon July 15, 2024 14:43
Copy link

codecov bot commented Jul 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.61%. Comparing base (f77341e) to head (15950ab).

Additional details and impacted files
@@                              Coverage Diff                              @@
##           feat/add-dataset-automatic-task-distribution    #5227   +/-   ##
=============================================================================
  Coverage                                         90.60%   90.61%           
=============================================================================
  Files                                               137      137           
  Lines                                              5746     5752    +6     
=============================================================================
+ Hits                                               5206     5212    +6     
  Misses                                              540      540           
Flag Coverage Δ
argilla-server 90.61% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jfcalvo jfcalvo merged commit b456600 into feat/add-dataset-automatic-task-distribution Jul 16, 2024
6 checks passed
@jfcalvo jfcalvo deleted the improvement/capture-and-retry-concurrent-update-errors branch July 16, 2024 07:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants