-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Read multiple blocks if available while loading Metafile contents #1013
Conversation
Fixes #1011 Signed-off-by: Dave Parfitt <[email protected]>
98822f7
to
8e5681b
Compare
I concur - the proposed fix solves the problem. |
@habitat-sh/habitat-core-maintainers this should be reviewed/merged asap! |
Looks good! |
📌 Commit 8e5681b has been approved by |
Read multiple blocks if available while loading Metafile contents Fixes #1011 This adds a `components/core/tests/fixtures/unhappyhumans-possums-8.1.4-20160427165340-x86_64-linux.hart` file with 1024 lines in TDEPS to test with. cc @habitat-sh/habitat-core-maintainers
☀️ Test successful - travis |
Sweet, I was looking for this PR. Nice to see that it's merged. |
oops, just noticed the branch name is off. my bad :-(
|
This changeset contains several logical changes, all of which are coupled together as they were discovered in the process of refactoring the installation logic. The logical changes are broken out into sections below. Pull-Through Local Caching Package Installation ----------------------------------------------- The implementation for package installation was re-written in order to agressively use the artifact cache (by default `/hab/cache/artifacts` on disk). The previous implementation was built up over time with installation from local artifacts added at a later date. Consequently, many installation inconsistencies were present: * When performing an installation from a local package artifact, all dependencies were blindly installed from an upstream Depot--whether or not they were installed or even cached on disk. * When installing from a package identifier string, the fully dependency list was queried from an upstream Depot even though the fetched package artifact contained this metadata. Furthermore, the artifact's own metadata should be trusted over any other source, assuming the package is confirmed "verified". * The local package artifact installation read package dependencies from its metadata file correctly which uncovered a bug (fixed in #1013) which would have been found earlier had all installation strategies uses the same code path. * When performing an installation from a local package artifact, the artifact file was never copied into the artifact cache for future re-use. Neither were any locally available dependency package artifacts. This change introduces a new internal struct called an `InstallTask` that maintains knowledge of various directories as well as a common Depot client which is only constructed once. There are 2 primary installation strategies still present: install from a package identifier string (ex: `core/redis`) or install from a local package artifact (ex: `*.hart`). Effort was made to ensure that both approaches use the same code path as much as possible to minimize future behavioral bugs. The updated installation high level strategies are summarized here: From Package Identifier: 1. Is the given package identifer fully qualified? If not, query the upstream Depot for the latest package identifier that satisfies the given "fuzzy" package identifer. In this way a given `core/redis` may return a fully qualified `core/redis/3.0.7/20160614001713` identifier. 2. Is the specific release already installed for the determined fully qualified package identifier? If so, return early as there is no further work to do. 3. Is there a package artifact for the specific release in the artifact cache? If there is, we use this one. Otherwise, fetch the specific release from the upstream Depot and write the package artifact into the artifact cache. 4. Verify the package artifact to ensure that its fully qualified package identifier precisely matches the intended target package identifier. Next, verify that the package artifact is correctly signed and is self-consistent. 5. Determine the fully list of runtime dependencies by extracting the `TDEPS` metadata file from the local package artifact. For each dependency, perform steps 2 through 4 above so that all dependencies are either confirmed pre-installed or each cached package artifact has been fully verified. 6. Extract/unpack each package artifact dependency that was not found to be pre-installed, followed by the target package artifact last. The order is important to ensure that target package is not installed until all of its dependencies are correctly installed. From Local Package Artifact: 1. Extract the fully qualified package identifier from the `IDENT` metadata file in the package artifact. 2. Is the specific release already installed for the determined fully qualified package identifier? If so, return early as there is no further work to do. 3. Copy the artifact into the artifact cache and rewrite the file name based on the computed artifact name. This means that we can install from a Habitat package with any arbitrary name like `file.1` and it would be rewritten to `core-redis-3.0.7-20160614001713-x86_64-linux.hart` in the artifact cache. 4. Verify the package artifact to ensure that its fully qualified package identifier precisely matches the intended target package identifier. Next, verify that the package artifact is correctly signed and is self-consistent. 5. Determine the fully list of runtime dependencies by extracting the `TDEPS` metadata file from the local package artifact. For each dependency, perform steps 2 through 4 in the "From Package Identifier" section so that all dependencies are either confirmed pre-installed or each cached package artifact has been fully verified. There is one difference however: for each dependency, first check to see if a package artifact exists in the same directory that contained the original target package artifact. There is a good chance that if the directory contains many other package artifacts that we can use these rather than re-fetching the exact same ones from an upstream Depot. Unlike the "From Package Identifier" scenario, we have an explicit directory we can check, so there is no harm in trying. 6. Extract/unpack each package artifact dependency that was not found to be pre-installed, followed by the target package artifact last. The order is important to ensure that target package is not installed until all of its dependencies are correctly installed. New: Supervisor Start From Package Artifact ------------------------------------------- In the course of the installation refactoring above, an easy new improvement (or feature) presented itself: the ability to start supervising a package that has not yet been installed, given a local package artifact. In other words, this change now supports the following: hab start ./results/core-redis-3.0.7-20160614001713-x86_64-linux.hart or: hab sup start /tmp/redis.hart assuming that `/tmp/redis.hart` is a legitimate Habitat package artifact and was simply renamed. As with the installation implementation rewrite above, if the package is already installed, then there is no work to do, otherwise the "From Local Package Artifact" strategy above is used. The intended primary use case for this pattern in for Plan authors working on their Plans in a Studio instance. This will allow them to build a package artifact and them immediately start it without having to install it beforehand. Secondarily, this pattern may be useful in environments with slow or limited internet access and eventually fully "air gapped" environments. Fixed: at-once Update Strategy ------------------------------ Also in the course of the above refactoring 2 bugs were found in the "at-once" update strategy in the Supervisor: 1. Once a new version of a package is found in an upstream Depot, it is installed but the new version is not loaded when `start_package()` is called (source: https://github.com/habitat-sh/habitat/blob/9e25233fafe0433864ab47d0d8ede6344d30be8a/components/sup/src/command/start.rs#L120) with the consequence that the original release is re-loaded and new version is never used until the Supervisor completely restarts. 2. The former custom install logic in the "at-once" strategy only fetched and extracted the new target package meaning that if the new release contained updated dependency versions, these would not be installed and could result in the supervised process entering a crash loop. The fix for this was simply to invoke the same installation logic as used everywhere else in the system and is explained in much greater detail above. Improved: Removed Optional Depot URL in Supervisor -------------------------------------------------- While refactoring the Supervisor component, it became clear that earlier versions of the codebase accepted an *optional* upstream Depot URL. Currently, fall-back default URL values are used and so we can guarentee that a value is always present for the Supervisor. The config value of `url` was updated to simply be a `String` (formerly an `Option<String>`) and all extra checking was removed where a value may not have been present. Less branching logic is almost always a good thing! Bonus: Remove Unused Imports ---------------------------- In the course of regression testing, several "unused import" warnings were found and removed. A few were due to code refactoring, but others were most likely due to upgrading the Rust compiler toolchain in the last few weeks. Just keeping the shop floor swept. Signed-off-by: Fletcher Nichol <[email protected]>
This changeset contains several logical changes, all of which are coupled together as they were discovered in the process of refactoring the installation logic. The logical changes are broken out into sections below. Pull-Through Local Caching Package Installation ----------------------------------------------- The implementation for package installation was re-written in order to aggressively use the artifact cache (by default `/hab/cache/artifacts` on disk). The previous implementation was built up over time with installation from local artifacts added at a later date. Consequently, many installation inconsistencies were present: * When performing an installation from a local package artifact, all dependencies were blindly installed from an upstream Depot--whether or not they were installed or even cached on disk. * When installing from a package identifier string, the fully dependency list was queried from an upstream Depot even though the fetched package artifact contained this metadata. Furthermore, the artifact's own metadata should be trusted over any other source, assuming the package is confirmed "verified". * The local package artifact installation read package dependencies from its metadata file correctly which uncovered a bug (fixed in #1013) which would have been found earlier had all installation strategies uses the same code path. * When performing an installation from a local package artifact, the artifact file was never copied into the artifact cache for future re-use. Neither were any locally available dependency package artifacts. This change introduces a new internal struct called an `InstallTask` that maintains knowledge of various directories as well as a common Depot client which is only constructed once. There are 2 primary installation strategies still present: install from a package identifier string (ex: `core/redis`) or install from a local package artifact (ex: `*.hart`). Effort was made to ensure that both approaches use the same code path as much as possible to minimize future behavioral bugs. The updated installation high level strategies are summarized here: From Package Identifier: 1. Is the given package identifier fully qualified? If not, query the upstream Depot for the latest package identifier that satisfies the given "fuzzy" package identifier. In this way a given `core/redis` may return a fully qualified `core/redis/3.0.7/20160614001713` identifier. 2. Is the specific release already installed for the determined fully qualified package identifier? If so, return early as there is no further work to do. 3. Is there a package artifact for the specific release in the artifact cache? If there is, we use this one. Otherwise, fetch the specific release from the upstream Depot and write the package artifact into the artifact cache. 4. Verify the package artifact to ensure that its fully qualified package identifier precisely matches the intended target package identifier. Next, verify that the package artifact is correctly signed and is self-consistent. 5. Determine the fully list of runtime dependencies by extracting the `TDEPS` metadata file from the local package artifact. For each dependency, perform steps 2 through 4 above so that all dependencies are either confirmed pre-installed or each cached package artifact has been fully verified. 6. Extract/unpack each package artifact dependency that was not found to be pre-installed, followed by the target package artifact last. The order is important to ensure that target package is not installed until all of its dependencies are correctly installed. From Local Package Artifact: 1. Extract the fully qualified package identifier from the `IDENT` metadata file in the package artifact. 2. Is the specific release already installed for the determined fully qualified package identifier? If so, return early as there is no further work to do. 3. Copy the artifact into the artifact cache and rewrite the file name based on the computed artifact name. This means that we can install from a Habitat package with any arbitrary name like `file.1` and it would be rewritten to `core-redis-3.0.7-20160614001713-x86_64-linux.hart` in the artifact cache. 4. Verify the package artifact to ensure that its fully qualified package identifier precisely matches the intended target package identifier. Next, verify that the package artifact is correctly signed and is self-consistent. 5. Determine the fully list of runtime dependencies by extracting the `TDEPS` metadata file from the local package artifact. For each dependency, perform steps 2 through 4 in the "From Package Identifier" section so that all dependencies are either confirmed pre-installed or each cached package artifact has been fully verified. There is one difference however: for each dependency, first check to see if a package artifact exists in the same directory that contained the original target package artifact. There is a good chance that if the directory contains many other package artifacts that we can use these rather than re-fetching the exact same ones from an upstream Depot. Unlike the "From Package Identifier" scenario, we have an explicit directory we can check, so there is no harm in trying. 6. Extract/unpack each package artifact dependency that was not found to be pre-installed, followed by the target package artifact last. The order is important to ensure that target package is not installed until all of its dependencies are correctly installed. New: Supervisor Start From Package Artifact ------------------------------------------- In the course of the installation refactoring above, an easy new improvement (or feature) presented itself: the ability to start supervising a package that has not yet been installed, given a local package artifact. In other words, this change now supports the following: hab start ./results/core-redis-3.0.7-20160614001713-x86_64-linux.hart or: hab sup start /tmp/redis.hart assuming that `/tmp/redis.hart` is a legitimate Habitat package artifact and was simply renamed. As with the installation implementation rewrite above, if the package is already installed, then there is no work to do, otherwise the "From Local Package Artifact" strategy above is used. The intended primary use case for this pattern in for Plan authors working on their Plans in a Studio instance. This will allow them to build a package artifact and them immediately start it without having to install it beforehand. Secondarily, this pattern may be useful in environments with slow or limited internet access and eventually fully "air gapped" environments. Fixed: at-once Update Strategy ------------------------------ Also in the course of the above refactoring 2 bugs were found in the "at-once" update strategy in the Supervisor: 1. Once a new version of a package is found in an upstream Depot, it is installed but the new version is not loaded when `start_package()` is called (source: https://github.com/habitat-sh/habitat/blob/9e25233fafe0433864ab47d0d8ede6344d30be8a/components/sup/src/command/start.rs#L120) with the consequence that the original release is re-loaded and new version is never used until the Supervisor completely restarts. 2. The former custom install logic in the "at-once" strategy only fetched and extracted the new target package meaning that if the new release contained updated dependency versions, these would not be installed and could result in the supervised process entering a crash loop. The fix for this was simply to invoke the same installation logic as used everywhere else in the system and is explained in much greater detail above. Improved: Removed Optional Depot URL in Supervisor -------------------------------------------------- While refactoring the Supervisor component, it became clear that earlier versions of the codebase accepted an *optional* upstream Depot URL. Currently, fall-back default URL values are used and so we can guarantee that a value is always present for the Supervisor. The config value of `url` was updated to simply be a `String` (formerly an `Option<String>`) and all extra checking was removed where a value may not have been present. Less branching logic is almost always a good thing! Bonus: Remove Unused Imports ---------------------------- In the course of regression testing, several "unused import" warnings were found and removed. A few were due to code refactoring, but others were most likely due to upgrading the Rust compiler toolchain in the last few weeks. Just keeping the shop floor swept. Signed-off-by: Fletcher Nichol <[email protected]>
This changeset contains several logical changes, all of which are coupled together as they were discovered in the process of refactoring the installation logic. The logical changes are broken out into sections below. Pull-Through Local Caching Package Installation ----------------------------------------------- The implementation for package installation was re-written in order to aggressively use the artifact cache (by default `/hab/cache/artifacts` on disk). The previous implementation was built up over time with installation from local artifacts added at a later date. Consequently, many installation inconsistencies were present: * When performing an installation from a local package artifact, all dependencies were blindly installed from an upstream Depot--whether or not they were installed or even cached on disk. * When installing from a package identifier string, the fully dependency list was queried from an upstream Depot even though the fetched package artifact contained this metadata. Furthermore, the artifact's own metadata should be trusted over any other source, assuming the package is confirmed "verified". * The local package artifact installation read package dependencies from its metadata file correctly which uncovered a bug (fixed in #1013) which would have been found earlier had all installation strategies uses the same code path. * When performing an installation from a local package artifact, the artifact file was never copied into the artifact cache for future re-use. Neither were any locally available dependency package artifacts. This change introduces a new internal struct called an `InstallTask` that maintains knowledge of various directories as well as a common Depot client which is only constructed once. There are 2 primary installation strategies still present: install from a package identifier string (ex: `core/redis`) or install from a local package artifact (ex: `*.hart`). Effort was made to ensure that both approaches use the same code path as much as possible to minimize future behavioral bugs. The updated installation high level strategies are summarized here: From Package Identifier: 1. Is the given package identifier fully qualified? If not, query the upstream Depot for the latest package identifier that satisfies the given "fuzzy" package identifier. In this way a given `core/redis` may return a fully qualified `core/redis/3.0.7/20160614001713` identifier. 2. Is the specific release already installed for the determined fully qualified package identifier? If so, return early as there is no further work to do. 3. Is there a package artifact for the specific release in the artifact cache? If there is, we use this one. Otherwise, fetch the specific release from the upstream Depot and write the package artifact into the artifact cache. 4. Verify the package artifact to ensure that its fully qualified package identifier precisely matches the intended target package identifier. Next, verify that the package artifact is correctly signed and is self-consistent. 5. Determine the fully list of runtime dependencies by extracting the `TDEPS` metadata file from the local package artifact. For each dependency, perform steps 2 through 4 above so that all dependencies are either confirmed pre-installed or each cached package artifact has been fully verified. 6. Extract/unpack each package artifact dependency that was not found to be pre-installed, followed by the target package artifact last. The order is important to ensure that target package is not installed until all of its dependencies are correctly installed. From Local Package Artifact: 1. Extract the fully qualified package identifier from the `IDENT` metadata file in the package artifact. 2. Is the specific release already installed for the determined fully qualified package identifier? If so, return early as there is no further work to do. 3. Copy the artifact into the artifact cache and rewrite the file name based on the computed artifact name. This means that we can install from a Habitat package with any arbitrary name like `file.1` and it would be rewritten to `core-redis-3.0.7-20160614001713-x86_64-linux.hart` in the artifact cache. 4. Verify the package artifact to ensure that its fully qualified package identifier precisely matches the intended target package identifier. Next, verify that the package artifact is correctly signed and is self-consistent. 5. Determine the fully list of runtime dependencies by extracting the `TDEPS` metadata file from the local package artifact. For each dependency, perform steps 2 through 4 in the "From Package Identifier" section so that all dependencies are either confirmed pre-installed or each cached package artifact has been fully verified. There is one difference however: for each dependency, first check to see if a package artifact exists in the same directory that contained the original target package artifact. There is a good chance that if the directory contains many other package artifacts that we can use these rather than re-fetching the exact same ones from an upstream Depot. Unlike the "From Package Identifier" scenario, we have an explicit directory we can check, so there is no harm in trying. 6. Extract/unpack each package artifact dependency that was not found to be pre-installed, followed by the target package artifact last. The order is important to ensure that target package is not installed until all of its dependencies are correctly installed. New: Supervisor Start From Package Artifact ------------------------------------------- In the course of the installation refactoring above, an easy new improvement (or feature) presented itself: the ability to start supervising a package that has not yet been installed, given a local package artifact. In other words, this change now supports the following: hab start ./results/core-redis-3.0.7-20160614001713-x86_64-linux.hart or: hab sup start /tmp/redis.hart assuming that `/tmp/redis.hart` is a legitimate Habitat package artifact and was simply renamed. As with the installation implementation rewrite above, if the package is already installed, then there is no work to do, otherwise the "From Local Package Artifact" strategy above is used. The intended primary use case for this pattern in for Plan authors working on their Plans in a Studio instance. This will allow them to build a package artifact and them immediately start it without having to install it beforehand. Secondarily, this pattern may be useful in environments with slow or limited internet access and eventually fully "air gapped" environments. Fixed: at-once Update Strategy ------------------------------ Also in the course of the above refactoring 2 bugs were found in the "at-once" update strategy in the Supervisor: 1. Once a new version of a package is found in an upstream Depot, it is installed but the new version is not loaded when `start_package()` is called (source: https://github.com/habitat-sh/habitat/blob/9e25233fafe0433864ab47d0d8ede6344d30be8a/components/sup/src/command/start.rs#L120) with the consequence that the original release is re-loaded and new version is never used until the Supervisor completely restarts. 2. The former custom install logic in the "at-once" strategy only fetched and extracted the new target package meaning that if the new release contained updated dependency versions, these would not be installed and could result in the supervised process entering a crash loop. The fix for this was simply to invoke the same installation logic as used everywhere else in the system and is explained in much greater detail above. Improved: Removed Optional Depot URL in Supervisor -------------------------------------------------- While refactoring the Supervisor component, it became clear that earlier versions of the codebase accepted an *optional* upstream Depot URL. Currently, fall-back default URL values are used and so we can guarantee that a value is always present for the Supervisor. The config value of `url` was updated to simply be a `String` (formerly an `Option<String>`) and all extra checking was removed where a value may not have been present. Less branching logic is almost always a good thing! Bonus: Remove Unused Imports ---------------------------- In the course of regression testing, several "unused import" warnings were found and removed. A few were due to code refactoring, but others were most likely due to upgrading the Rust compiler toolchain in the last few weeks. Just keeping the shop floor swept. Signed-off-by: Fletcher Nichol <[email protected]>
Fixes #1011
This adds a
components/core/tests/fixtures/unhappyhumans-possums-8.1.4-20160427165340-x86_64-linux.hart
file with 1024 lines in TDEPS to test with.cc @habitat-sh/habitat-core-maintainers