Feature/auto migration #5

Merged
180 commits merged on Nov 22, 2022

Conversation

fia0
Contributor

@fia0 fia0 commented Nov 2, 2022

Description

This PR adds support for automated migration policies that can be used in a dynamic and adaptable manner. Two policies are implemented; both are described in more detail in https://git.spacesnek.rocks/johannes/master-thesis.

Changes

Large parts of the code have been modified, with the most substantial changes in the user interface and the DMU. The new "migration" module contains everything related to the decisions made during automated migration.
Additionally, this PR fixes some issues in the manual migration interface of datasets and object stores, and allows storage hints on DMU nodes.

A number of fixes to the existing code have also been made; notably, we no longer deadlock when an allocation fails unexpectedly. Some tests have been modified and extended to cover the newly added features.

Suggestions

We might consider adding https://github.com/jwuensche/lfu-cache to the organization. This repository contains a fork of mine, which extends a crate with some necessary features for our use-case.

Open Issues

The thesis lays out some issues that still exist in the stack and that we should consider solving soon.

  • Active Node Migration
  • Object storage awareness (when using fallback storage)
  • Delayed migration for large objects

Completed

  • Migration policies
  • User configuration of policies
  • DMU modifications
  • Object Store modifications
  • Dataset modifications
  • Minor fixes & optimizations
  • Thorough documentation of the migration module

Johannes Wünsche added 30 commits July 20, 2022 22:01
If migration policies are used, they may require the initial state of the
dataset to process the node distribution correctly. Depending on its future
implications, for example regarding the memory footprint, this is subject
to change; it is a first step towards considering policy restrictions
and opportunities.
This allows the tracking of objects over multiple read/write cycles.
Usable in `migration` via `ProfileMsg`.
The event was swapped to the actual result; this has now been fixed.
We might avoid the db lock entirely and rely on the DMU and its
handler for this. TODO
This commit adds an optional value to the cache which contains the
previous storage location, to prevent loss of information when evicting
objects from the cache whose actual objectref might not be known when
write back is called on them.
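
As a rough illustration of the idea in the commit message above, a cache entry can carry its last known storage location so that write back does not depend on an external object reference. This is only a sketch: `StorageLocation`, `CacheEntry`, `WriteBack` and `plan_write_back` are hypothetical names chosen for the example and are not taken from the code in this PR.

```rust
/// Hypothetical on-disk location of an object (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StorageLocation {
    pub tier: u8,
    pub offset: u64,
}

/// A cache entry that remembers where its value was last stored.
#[derive(Debug)]
pub struct CacheEntry<T> {
    pub value: T,
    /// `None` for objects that have never been written out.
    pub previous_location: Option<StorageLocation>,
}

/// What an eviction hands to the write-back step (assumed shape).
#[derive(Debug, PartialEq, Eq)]
pub enum WriteBack {
    /// No earlier copy is known, so placement is unconstrained.
    Fresh,
    /// An earlier copy exists at this location.
    Rewrite(StorageLocation),
}

/// Decide how to write back an evicted entry using only the entry itself,
/// without needing the object reference that pointed to it.
pub fn plan_write_back<T>(entry: &CacheEntry<T>) -> WriteBack {
    match entry.previous_location {
        Some(loc) => WriteBack::Rewrite(loc),
        None => WriteBack::Fresh,
    }
}
```

Keeping the location inside the entry means the information survives eviction even when the caller no longer holds the reference.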
This allows multiple tests to rely on the same files without
additional checking or dependencies between tests. Problematic at the
moment is that some tests are order dependent because the storage is not
nulled. Reading some data results in _some_ cases in a checksum
error. This happens in either file mode, indicating a problem
which has not been discovered yet.
This extension provides an interface for system-chosen storage
preferences. Every node can now be given an additional preference which
can upgrade the existing preference based on its children/keys.

In combination with Migration Policies upgrades/downgrades of
preferences are possible.
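
One way to picture the two preference sources is the sketch below. It assumes that a system-chosen preference, when present, takes precedence over the preference derived from the node's children/keys; the commit message does not spell out the exact rule, so this precedence is an assumption. `Tier`, `NodePreference` and `effective` are hypothetical names for illustration only.

```rust
/// Hypothetical storage tiers, ordered from fastest to slowest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    Fastest,
    Fast,
    Slow,
    Slowest,
}

/// The preferences attached to a single node (assumed layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodePreference {
    /// Preference derived from the node's children/keys.
    pub from_keys: Option<Tier>,
    /// Additional preference chosen by the system, e.g. by a migration policy.
    pub system: Option<Tier>,
}

impl NodePreference {
    /// Assumption: the system preference wins when it is set, which lets a
    /// policy both upgrade and downgrade the placement of a node.
    pub fn effective(&self) -> Option<Tier> {
        self.system.or(self.from_keys)
    }
}
```

With this shape, clearing `system` falls back to the key-derived preference, so a policy decision can be undone without touching user-given hints.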
@fia0
Contributor Author

fia0 commented Nov 8, 2022

Just finishing up the PR right now. I found an error in the tests which needs to be fixed first, but then we should be ready to go.

@fia0 fia0 marked this pull request as ready for review November 8, 2022 15:22
@fia0
Contributor Author

fia0 commented Nov 8, 2022

Finally ready to be reviewed. Tests are now failing as expected; we'll have to fix them in follow-up PRs.

@fia0 fia0 mentioned this pull request Nov 10, 2022
Johannes Wünsche added 6 commits November 11, 2022 13:37
Removing the option eliminates some error-prone workflows and simplifies the
access structure. The closing logic has been modified to check for _strong_
references to the dataset; only when no reference other than the current one
exists will the dataset be closed for good.
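
The closing rule described above can be sketched with `std::sync::Arc` and `Arc::strong_count`. This is a minimal illustration under assumptions, not the code from this PR: `Dataset` and `Registry` are hypothetical types, and the registry is assumed to hold exactly one strong reference of its own per open dataset.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical dataset handle target (illustrative only).
pub struct Dataset {
    pub name: String,
}

/// Hypothetical registry that owns one strong reference per open dataset.
#[derive(Default)]
pub struct Registry {
    open: HashMap<String, Arc<Dataset>>,
}

impl Registry {
    /// Open a dataset, or hand out another reference if it is already open.
    pub fn open(&mut self, name: &str) -> Arc<Dataset> {
        let ds = self
            .open
            .entry(name.to_string())
            .or_insert_with(|| Arc::new(Dataset { name: name.to_string() }));
        Arc::clone(ds)
    }

    /// Give back a handle. Returns `true` if the dataset was closed for good.
    pub fn close(&mut self, handle: Arc<Dataset>) -> bool {
        let name = handle.name.clone();
        // The registry holds one reference and `handle` is another, so a
        // count of 2 means nobody else is still using the dataset.
        if Arc::strong_count(&handle) <= 2 {
            drop(handle);
            self.open.remove(&name);
            true
        } else {
            // Other users remain; `handle` is simply dropped on return.
            false
        }
    }

    pub fn is_open(&self, name: &str) -> bool {
        self.open.contains_key(name)
    }
}
```

Note that `Arc::strong_count` is only a snapshot: if other threads can clone existing handles concurrently, a real implementation would need additional synchronization around this check.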
@michaelkuhn michaelkuhn merged commit 9b1caac into parcio:main Nov 22, 2022
@fia0 fia0 deleted the feature/auto-migration branch September 20, 2023 06:57
2 participants