Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving storage performance #1659

Merged
merged 8 commits into from
Aug 3, 2020

Conversation

GitHK
Copy link
Contributor

@GitHK GitHK commented Jul 30, 2020

What do these changes do?

The list_files function is now optimised to run with big datasets, and no longer depends on third party S3 calls when listing files in the storage service.

Removed

  • database updates when retrieving the real file names (which might have been renamed) => responsible for 40% of the slowdown
  • file metadata updates in the database after invoking an S3 call for each file when listing all the files => responsible for the majority of the reminding slowdown

Added

  • a metadata_file_updater launched before returning the upload link for each file to be uploaded. It will try to update the metadata for each file. It uses an exponential backoff retry to account for huge file uploads. It also has a maximum amount of possible retries. If it fails to update the metadata, an error message will be logged.

Changes

Related issue number

Closes #1559

How to test

Start the project and make sure to have lots of files in your storage. With the old implementation performances degraded after 200 uploaded files.

Checklist

  • Did you change any service's API? Then make sure to bundle document and upgrade version (make openapi-specs, git commit ... and then make version-*)
  • Unit tests for the changes exist
  • Runs in the swarm
  • Documentation reflects the changes
  • New module? Add your github username to .github/CODEOWNERS

Andrei Neagu added 4 commits July 30, 2020 15:33
the metadata will be synced by a background worker spawned for
for each file when a new upload url is generated
@GitHK GitHK added t:enhancement Improvement or request on an existing feature a:storage issue related to storage service labels Jul 30, 2020
@GitHK GitHK added this to the Xie-Xie milestone Jul 30, 2020
@GitHK GitHK self-assigned this Jul 30, 2020
@GitHK GitHK marked this pull request as draft July 30, 2020 13:54
@GitHK GitHK changed the title Improving storage performance WIP: Improving storage performance Jul 30, 2020
@codecov
Copy link

codecov bot commented Jul 30, 2020

Codecov Report

Merging #1659 into master will increase coverage by 5.4%.
The diff coverage is 72.1%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #1659     +/-   ##
========================================
+ Coverage    68.2%   73.6%   +5.4%     
========================================
  Files         256     278     +22     
  Lines        9442   10918   +1476     
  Branches     1010    1179    +169     
========================================
+ Hits         6442    8046   +1604     
+ Misses       2777    2529    -248     
- Partials      223     343    +120     
Flag Coverage Δ
#integrationtests 56.7% <ø> (?)
#unittests 67.4% <72.1%> (-0.9%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...vices/storage/src/simcore_service_storage/utils.py 69.5% <62.5%> (ø)
...ervices/storage/src/simcore_service_storage/dsm.py 65.0% <73.5%> (ø)
...es/storage/src/simcore_service_storage/settings.py 100.0% <0.0%> (ø)
...ces/storage/src/simcore_service_storage/datcore.py 15.4% <0.0%> (ø)
...s/storage/src/simcore_service_storage/db_tokens.py 90.0% <0.0%> (ø)
...s/storage/src/simcore_service_storage/resources.py 100.0% <0.0%> (ø)
...age/src/simcore_service_storage/datcore_wrapper.py 56.1% <0.0%> (ø)
...storage/src/simcore_service_storage/application.py 46.6% <0.0%> (ø)
...ervices/storage/src/simcore_service_storage/cli.py 75.7% <0.0%> (ø)
...es/storage/src/simcore_service_storage/__init__.py 100.0% <0.0%> (ø)
... and 44 more

@GitHK GitHK requested review from mguidon, odeimaiz and sanderegg July 30, 2020 14:55
@GitHK GitHK marked this pull request as ready for review July 30, 2020 14:55
@GitHK GitHK changed the title WIP: Improving storage performance Improving storage performance Jul 30, 2020
Copy link
Member

@sanderegg sanderegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use tenacity for the exponential backoff, and it works through a decorator:
https://github.com/jd/tenacity

Comment on lines +427 to +428
Will retry max_update_retries to update the metadata on the file after an upload.
If it is not successfull it will exit and log an error.
Copy link
Member

@sanderegg sanderegg Jul 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use tenacity here? I think most of these features are available in the @Retry decorator.

stuff like this:

@retry(wait=wait_exponential(multiplier=1, min=4, max=10))
    def wait_exponential_1():
        print("Wait 2^x * 1 second between each retry starting with 4 seconds, then up to 10 seconds, then 10 seconds afterwards")
        raise Exception

Copy link
Contributor Author

@GitHK GitHK Jul 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it was too generic for my use case. I needed to sleep only for a subsection of the function

Edit: I have skipped tenacity on purpose here. Also the exponential backoff generator implementation comes from the backoff library.

services/storage/src/simcore_service_storage/dsm.py Outdated Show resolved Hide resolved
services/storage/src/simcore_service_storage/dsm.py Outdated Show resolved Hide resolved
- using logging ensure_future
- freformatted code
@GitHK GitHK requested a review from sanderegg July 30, 2020 15:35
Copy link
Member

@odeimaiz odeimaiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Demo worked like a charm.

@odeimaiz odeimaiz merged commit f0e11ca into ITISFoundation:master Aug 3, 2020
@odeimaiz odeimaiz mentioned this pull request Aug 4, 2020
odeimaiz added a commit that referenced this pull request Aug 4, 2020
- UI/UX improvements (#1657)
- Bump yarl from 1.4.2 to 1.5.1 in /packages/postgres-database (#1665)
- Bump ujson from 3.0.0 to 3.1.0 in /packages/service-library (#1664)
- Bump pytest-docker from 0.7.2 to 0.8.0 in /packages/service-library (#1647)
- Improving storage performance (#1659)
- Bump aiozipkin from 0.6.0 to 0.7.0 in /packages/service-library (#1642)
- Theming (#1656)
- Platform stability:  (#1645)
- is1594 fix and re-activate e2e testing (#1620)
- 2 bugs fixed + Some improvements (#1634)
- Fixes default (#1640)
- Bump lodash from 4.17.15 to 4.17.19 (#1639)
- Is1585/cleanup storage (#1586)
- Fixes on publish studies handling (#1632)
- Some enhancements and bug fixes (#1608)
- Improve e2e  (#1631)
- filter studies by name before deleting them (#1629)
- Maintenance/upgrades test tools (#1628)
- Bugfix/concurent opening projects (#1598)
- Bugfix/allow reading groups anonymous user (#1615)
- Bump docker from 4.2.1 to 4.2.2 in /packages/postgres-database (#1605)
- fix testing if node has gpu support (#1604)
- [bugfix] Invalidate cache before starting a study (#1602)
- Feature/fix e2e 2 (#1600)
- fix deploy not needing e2e testing since it is disabled
- reduce cardinality of metrics (#1593)
- Excudes e2e stage from include until fixed (#1595)
- Shared project concurrency (frontend) (#1591)
- Homogenize studies and services (#1569)
- [feature] UI Fine grained access - project locking and notification
- Bugfix/apiserver does not need sslheaders (#1564)
- Cleanup catalog service (#1582)
- Maintenance/cleanup api server (#1578)
- Adds support for GPU scheduling of computational services (#1553)
- Maintenance/upgrades and tooling (#1546)
- Is1570/study fails 500 (#1572)
- Bump faker from 4.1.0 to 4.1.1 in /packages/postgres-database (#1573)
- maintenance fix codecov reports (#1568)
- Manage groups, Share studies (#1512)
- Is/add notebook migration script (#1565)
- Is1269/api-server upgrade (#1475)
- added simcore_webserver_service in pytest simcore package (#1563)
- add traefik endpoint to api-gateway (#1555)
@GitHK GitHK mentioned this pull request Aug 17, 2020
@sanderegg sanderegg mentioned this pull request Aug 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:storage issue related to storage service t:enhancement Improvement or request on an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Synchronization minio -> filedatabase
3 participants