Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable spago install to work on advanced Dhall expressions #849

Closed
wants to merge 3 commits into from
Closed

Enable spago install to work on advanced Dhall expressions #849

wants to merge 3 commits into from

Conversation

JordanMartinez
Copy link
Contributor

Description of the change

Fixes #813. This is the same work as #815 but with a very simplified commit history (largely to see how much is being added when tests are not included) that has been updated to work on master. Unlike #815, this also includes an explanation for this PR.

Checklist:

  • Added the change to the "Unreleased" section of the changelog
  • Added some example of the new feature to the README
  • Added a test for the contribution (if applicable)

Overview of PR

Currently, spago install works by finding the first dependencies key in the Dhall expression. If the value stored at the key is an empty list (i.e. [] : List Text) or a list literal (i.e. [ "effect", "prelude"]), spago adds the dependencies to the list and succeeds. Spago will crash when one uses a list append (i.e. config.dependencies # someOtherList) or any expression more complicated than that.

The current spago install implementation takes a very simplistic approach to installing dependencies because of how complex it can be to correctly update the raw Dhall expression's source code (i.e. this PR) without doing something wrong. Enabling spago install to work (and perhaps in the future spago uninstall) requires dealing with more complex possible Dhall expressions.

The present PR enables spago install to work on these more advanced features by taking the below approach:

  1. Fully normalize the spago.dhall file before editing it (i.e. resolve imports like https://github.com/purescript/package-sets/releases/download/psc-0.14.1-20210419/packages.dhall)
  2. Of the dependencies the user wants to install, determine which actually need to be installed (i.e. difference requestedPkgs currentPkgs)
  3. Attempt to produce a new raw Dhall expression by traversing through the expression and applying the update, which should succeed if the Dhall expression matches Spago's expected config schema.
  4. Verify that the newly produced Dhall expression would produce the current unmodified Config value plus the new packages to install
  5. If the verification passes, overwrite the spago.dhall file with the new Dhall expression

Possible Concerns

Obviously, there's no way to guarantee that this command will succeed 100% of the time, but I don't think that's a major concern due to widespread usage of source control. There could be a bug in this implementation that causes it to fail. The verification step above will guarantee that the spago.dhall file isn't edited unless it's a valid change. However, if this check ever produces a false positive, then one's invalid spago.dhall file could cause future spago commands to fail. Moreover, the user would need to revert these changes done by the CLI. Since most if not all users are using source control, this isn't a huge concern to me. In 80% of the cases, this command should work. In the 20% that it doesn't succeed or produces an invalid file that breaks future runs, source control can revert the changes and the user can edit the file manually.

One potential issue is if spago install is run with the offline flag (see #841). If the Internet cannot be accessed, then we can no longer do Step 1, which means there isn't a way to do Step 4. Rather, we'd have to assume that the package exists in the config folder and that the user is ok with creating a spago.dhall file that may not work. Since most people are using source control, I don't personally see this as an issue because discarding these changes is easy.

Another issue with this PR is the maintenance burden it imposes. This PR adds 1.8KLOC to this codebase, most of which are test files, verifying that the implementation is correct. The heart of this PR is the Spago/Config/AST.hs file (680 LOC, a lot of which is documentation). If we ever need to change the config format, at the very least, many of these test files would need to be updated. One possible way to implement this feature without imposing its corresponding maintenance burden is to port the updating code to its own library and have spago depend on it. I'm not sure how that would affect the test files (i.e. which would stay and which would be removed), but it relieves the burden of having to be directly responsible for any issues that may arise.

Implementation Details

There are a few things that make this implementation tricky. A raw Dhall expression...

  1. might have unresolved imports (e.g. the HTTP url above or a reference to a local file).
  2. might have let bindings where editing the binding's definition may introduce a side-effect.
  3. must be traversed correctly

Dealing with Unresolved Imports

Regarding the first... If we are trying to update something in the expression and we hit this case, the implementation doesn't know what the value represented by that import is. In that case, it must make a "best guess" effort. For example, the spago.dhall file always has at least one unresolved import when it references the packages.dhall file. Another example is taking an existing spago.dhall file and adding or overwriting some part of it.
let config = ./spago.dhall in config with sources = config.sources # [ "example/**/*.purs" ]
However, our "best guess" isn't entirely uninformed. When the original raw Dhall expression is fully normalized (i.e. Step 1), we can verify that the expression produces a valid Config value that has a dependencies key whose contents are a list of text (i.e. packages). As long as we traverse through the expression correctly, we know what changes need to be done if we hit such a case.

For example, each of the following expressions below (where some parts of the spago.dhall file has been omitted for brevity) has an unresolved import. The commented out code indicate what the correct update would be.

-- here we know that the import produces a Spago config value
-- because the full normalization part (i.e. Step 1) produces a valid Config value
{- let config = -} ./spago.dhall -- in config with dependencies = config.dependencies # [ "new" ]

-- here we know that the import produces a `List Text` value
-- because the full normalization part (i.e. Step 1) produces a valid Config value
{ dependencies = ./someList.dhall {- # [ "new" ] -}, ... }

These updates are safe because the full normalization (Step 1) has already told us that the expression eventually normalizes to a valid Spago schema. If we cannot do Step 1 (e.g. offline flag is enabled), we can still attempt to make these updates, but all risk is on the user at that point.

Updating let bindings correctly

Regarding the second... This is the issue I ran into here. Let's assume we have the following Dhall expression:

let someValue = [ "a", "b", "c" ]
in { dependencies = someValue, ... }

Ideally, running spago install new would update the above expression to

let someValue = [ "a", "b", "c", "new" ]
in { dependencies = someValue, ... }

rather than

let someValue = [ "a", "b", "c" ]
in { dependencies = someValue # [ "new" ], ... }

It's more idiomatic in style. However, this can (but doesn't necessarily) introduce a side-effect. Let's change the initial Dhall expression to the following (this might be incorrect Dhall, but it gets the point across):

let someValue = [ "a", "b", "c" ]
in { dependencies = someValue, name = List.fold someValue, ... }

Applying the same "stylistic" logic above to this new expression, we unintentionally modify the name field. If we had used the "unsylistic" logic, no side-effect would have occurred.

This might not seem like an issue, but consider the idea of targets. In #681, targets were originally proposed using let bindings, so that one target's dependencies and source globs could be easily depended upon by another target:

let packages = ./packages.dhall
let main = { packages = packages
           , dependencies = [ "dep1" ,"dep2" ]
           , sources = [ "src/**/*.purs" ]
           }
let test = { packages = packages
           , dependencies = main.dependencies # [ "spec" ]
           , sources = main.sources # [ "test/**/*.purs" ]
           }
in { name = "my-project"
   , target = { main, test }
   }

If we run spago install "new" on the main target, we want to update the main binding's dependencies field, so that the change propagates to the test target's dependencies. In other words, we want to use a "stylistic" logic update.

However, this is not as ambiguous as it may appear. So long as we verify that the only change made to the expression was what we intended to do, the "stylistic" logic update is valid. If it fails, we could fallback to the "unsylistic" logic. This PR's current state always uses the "unstylistic" logic. It should be updated to use this "fallback" idea.

Traversing the Expression

Updating a raw Dhall expression entails traversing through it correctly and then making an change in the correct location, so that future parses produces the intended value. We can think of the expression as a Zipper. As long as we know the "shape" of the Zipper, we can traverse through it to the correct location, apply our change, and then recurse back to the root, rebuilding the surrounding structure along the way.

The spago.dhall file format is just an eventual Record expression that eventually refers to a "dependencies" field that eventually refers to a list of text values. By "eventual"/"eventually", I mean there may be other expressions in-between one part (e.g. the root expression) and the next part (e.g. the "dependencies" field).

Fortunately, the traversal logic is pretty straight forward.

  1. Store a stack of keys that represent the fields we need to "go down" before we get to the expression we wish to update. Initially, this is just ["dependencies"].
  2. Start at the root expression
  3. Follow the expression's structure
    • If we come across a field projection (e.g. { config = ./spago.dhall }.config, we add the new field to the stack as we have to "go down" that key before we can "go down" the key currently at the top of the key stack.
    • If we come across a field that matches the key on the stack, we "go down" it and then pop that key off the stack.
    • If the key stack becomes empty, we are in the "dependencies" value's expression. We never want to recurse back out of this place as it may produce the "let binding side-effect" mentioned above.
    • If we come across a List-like value, we apply the update

Finally, for some other context, see this summary I wrote for the Contributors working group call

@f-f
Copy link
Member

f-f commented Feb 6, 2022

Thanks for putting this together @JordanMartinez!
Since that replaces #815 I'll go ahead and close that one. As I mentioned there I'd like to take a stance on this once we start using the new configuration format. I'm putting together a proof of concept of that based on #842, so I'll get back to this soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failed to add dependencies. The dependencies field wasn't a List of Strings.
2 participants