Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JuvixCore #1421

Merged
merged 85 commits into from
Aug 30, 2022
Merged

JuvixCore #1421

merged 85 commits into from
Aug 30, 2022

Conversation

lukaszcz
Copy link
Collaborator

@lukaszcz lukaszcz commented Jul 28, 2022

JuvixCore framework and evaluator.

Description

JuvixCore is the Core internal representation for Juvix. It is based on a simple extended lambda-calculus, stripped down enough to enable easy compilation, but high-level enough to enable easy translation from the frontend.

Checklist:

  • Node datatype
  • Info datatype
  • Recursors on Node
  • Evaluator
  • Typeclass instances for Node:
    • Equality (need to ignore Info)
    • Pretty printing
  • Errors
  • Parser for Node
  • Evaluator CLI
  • Evaluator tests
  • Strings
  • IO monad builtins
  • Transformations:
    • Eta-expansion of builtin and constructor applications

@lukaszcz lukaszcz self-assigned this Jul 28, 2022
@lukaszcz lukaszcz added the core Related to JuvixCore label Jul 28, 2022
@lukaszcz
Copy link
Collaborator Author

lukaszcz commented Aug 1, 2022

Regarding the Bound library, it does not seem fit for our purposes because of complexity issues. We end up with Node traversals (umaps, ufolds) in O(n^2). This is not acceptable, because n may be proportional to the size of the input program. We should try to keep traversals at O(n) or close. This seems to be the problem with most approaches based on "locally naming" free variables. The problem is more clearly visible in the paper http://www.cs.ru.nl/~james/RESEARCH/haskell2004.pdf on which the Bound library is partly based. There, both "abstract" and "instantiate" are clearly linear in the input term. The Bound library seems to mitigate this slightly by not requiring to walk into whatever you're substituting for to change the free variables to avoid capture. But you still need to walk the term you're instantiating to find the variables you need to substitute. Hence, it seems both instantiation and abstraction are still linear.

From https://www.slideshare.net/ekmett/bound-making-de-bruijn-succ-less, succ the whole thing:

"If we could succ an entire expression instead of on each individual variable, we would succ less. Instantiation wouldn't have to walk into that expression at all, and we could lift an Exp into Scope in O(1) instead of O(n)."

This may be confusing, but it seems to refer to what I mentioned above: one can "lift" an entire term we substitute for in O(1) without adjusting its variables to avoid capture. But this doesn't refer to the complexity of instantiation as a whole -- you still need to find the variable you want to instantiate which is linear in the term you substitute into. Reading the source code of Bound seems to confirm this.

In any case, you definitely still need a full tree traversal when walking under binders:

From https://www.slideshare.net/ekmett/bound-making-de-bruijn-succ-less, walking under binders:
nf (Lam b) = Lam $ toScope $ nf $ fromScope b

From Hackage description:
toScope :: Monad f => f (Var b a) -> Scope b f a
Convert from traditional de Bruijn to generalized de Bruijn indices.
This requires a full tree traversal

src/Juvix/Core/Evaluator.hs Outdated Show resolved Hide resolved
@lukaszcz lukaszcz force-pushed the juvix-core branch 3 times, most recently from d9e09be to 9d1cf65 Compare August 2, 2022 16:06
@lukaszcz
Copy link
Collaborator Author

lukaszcz commented Aug 3, 2022

Regarding Data.TMap, it has very fast lookup because it's based on arrays, but insertion and deletion are O(n) where n is the number of keys. This should be OK for our purposes, because we have only a small constant number of different infos at any given node, and we'll be doing lookup more often than insertion or deletion.

I still think we should change our Info type currently based on HashMaps to TMap, but we should be aware of the O(n) insertion/deletion issue, in case the above assumptions no longer hold in the future.

@lukaszcz
Copy link
Collaborator Author

lukaszcz commented Aug 3, 2022

The typerep-map package (with TMap) requires base < 4.15 and ghc-prim < 0.7. We have base = 4.16 and ghc-prim = 0.8. So I will keep the HashMap-based implementation for now. We can decide what to do with this later.

@lukaszcz lukaszcz mentioned this pull request Aug 3, 2022
13 tasks
@lukaszcz lukaszcz force-pushed the juvix-core branch 3 times, most recently from 0f534d9 to fdf07df Compare August 11, 2022 15:20
paulcadman
paulcadman previously approved these changes Aug 26, 2022
Copy link
Collaborator

@paulcadman paulcadman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes do not affect the rest of the compiler (yet), are very well tested and @lukaszcz explained the main ideas to us in person. I propose we merge this PR now so we can begin work on integration with the rest of the compiler.

@jonaprieto jonaprieto merged commit 3db92fa into main Aug 30, 2022
@jonaprieto jonaprieto deleted the juvix-core branch August 30, 2022 09:24
@jonaprieto jonaprieto added this to the 0.2.5 milestone Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Related to JuvixCore
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants