Add support for method-wise cache invalidation. #71

willow-ahrens · 2021-01-05T22:46:43Z

As requested, part 1 of ? of my previous too-big PR. I believe this patch allows for precompilation-safe method-wise cache invalidation needed in #59. Cache lookup works as follows:

Use Julia's method lookup to find a method based on a target signature. Then look in the module where the method was defined for a global dictionary named __memories__. If the dictionary exists, look up the method's cache by the structure of its signature rather than by the Method object itself (for some reason, the hash identity of methods seems to change during precompilation).
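The lookup steps above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the module `Defining`, the function `slow_square`, and the helper `find_cache` are hypothetical names, and the registry is filled by hand instead of by the `@memoize` macro. Note that `which` throws if no method matches, which this sketch does not handle.

```julia
# Hypothetical stand-in for a package that uses the scheme described above.
module Defining
    # Per-module registry: method signature => cache for that method.
    const __memories__ = Dict{Any,Any}()

    slow_square(x::Int) = x^2

    # Register a cache under the structural signature of the method.
    __memories__[which(slow_square, Tuple{Int}).sig] = Dict{Any,Any}((3,) => 9)
end

# Hypothetical helper: return the cache of the method that a call to `f`
# with argument types `types` dispatches to, or `nothing` if there is none.
function find_cache(f, types::Type{<:Tuple})
    meth = which(f, types)                    # Julia's own method lookup
    mod = meth.module                         # module that defined the method
    isdefined(mod, :__memories__) || return nothing
    memories = getfield(mod, :__memories__)
    return get(memories, meth.sig, nothing)   # keyed by signature, not by Method
end
```

With these definitions, `find_cache(Defining.slow_square, Tuple{Int})` returns the hand-registered dictionary, while a function from a module without a `__memories__` global yields `nothing`.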

We determine whether we are overwriting a method by looking up the signature in an earlier world age, because the function name we are looking for may not yet be defined. Note that finding a method with which is not enough to conclude that we have overwritten a method: the two signatures must also be ==.
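The signature check can be illustrated with a small example. It is only a sketch of the `==` comparison: the function `describe` is a hypothetical name, and the world-age part of the lookup is left out.

```julia
describe(x::Number) = "number"

# Signature of the method we are about to define: describe(::Int).
target_sig = Tuple{typeof(describe), Int}

# `which` succeeds here because describe(::Number) accepts an Int...
found = which(describe, Tuple{Int})

# ...but that method has a different signature, so defining describe(::Int)
# adds a new method instead of overwriting the existing one.
overwrites_before = found.sig == target_sig

describe(x::Int) = "int"

# Now the most specific match has exactly the target signature, so defining
# describe(::Int) again would overwrite it.
overwrites_after = which(describe, Tuple{Int}).sig == target_sig
```

Only in the second case would an existing cache need to be emptied.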

I have tested the effects of precompilation manually; I'm not sure whether there's a way to test for this automatically.

willow-ahrens · 2021-01-09T20:27:43Z

@cstjean Alright, I've added a few tests for precompilation, so I think this is ready for review, unless you'd like me to rebase first. Method caches are stored in the module that defined the method. Methods will always have a cache in the defining module even after precompilation. If the method is called in a module during precompilation, the result will be preserved after precompilation. If, during precompilation, we call a memoized method defined in a different module, that result will not be preserved, since caches are stored in the defining module, not the calling module. I believe this is also a limitation of the current design on master.

willow-ahrens · 2021-01-24T02:42:17Z

This PR can be simplified by using the which(f, args) form included in Base. The reason I included the custom _which(signature) function was to support callable objects in the future. If we have a callable type Foo, I wanted to be able to determine whether there is a matching method for instances of Foo without constructing an instance. Note that which(Foo, args...) asks for methods of the type Foo (its constructors) rather than methods of an object of type Foo.
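The distinction between the type and its instances can be seen with Base's `which` alone. In this sketch the struct `Foo` and its call overload are illustrative definitions, not code from the PR.

```julia
struct Foo
    a::Int
end

# Make instances of Foo callable.
(f::Foo)(x::Int) = f.a + x

# Asking about the type `Foo` finds a constructor method...
ctor_method = which(Foo, Tuple{Int})

# ...while asking about an instance finds the call-overload method,
# which requires constructing a Foo first.
call_method = which(Foo(1), Tuple{Int})
```

The two methods differ in the first element of their signatures: `Type{Foo}` for the constructor and `Foo` for the callable object.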

I also used the custom version of _which so that I could give it a world age. This allows me to define an anonymous function to get its name, then check whether any methods with the same name existed before the definition I just made.
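The idea of querying an earlier world age can be sketched with `hasmethod`, which accepts a `world` keyword. This is not the PR's `_which`; the function `fresh` is a hypothetical name, and `Base.get_world_counter` is an unexported internal that is assumed here to behave as it does in current Julia releases.

```julia
w_before = Base.get_world_counter()   # world age before the definition
fresh(x::Int) = x + 1                 # defining a method advances the world age
w_after = Base.get_world_counter()

# Was there a matching method before the definition above ran?
existed_before = hasmethod(fresh, Tuple{Int}; world = w_before)

# Is there one now?
exists_now = hasmethod(fresh, Tuple{Int}; world = w_after)
```

A lookup in the earlier world reports no method, so the definition is known to be new and not an overwrite.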

However, since this PR doesn't apply to anonymous functions or callable types, I can simplify it if the reviewers request it.

willow-ahrens · 2021-01-24T04:09:45Z

I find myself unsure of the desired semantics of cache invalidation and scoping for closures. While this PR makes a lot of sense for global functions, I think that closures might want to use local scope to store their caches. Consider the following example:

function currypow(n)
    function innerpow(x)
        return x^n
    end
end

In this example, each call to currypow will invalidate the cache of the last call to currypow, even though that closure might still be callable. Interestingly, I think it will still allocate dictionaries for each closure to be memoized separately. It's hard to imagine a case where one would want the results of an inner function to be memoized globally, but also want the function to remain an inner function.
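For comparison, a hand-written version that keeps the cache in local scope might look like the sketch below. It does not use `@memoize`; `currypow_local` is a hypothetical name and the `Dict{Any,Any}` cache is an assumption.

```julia
function currypow_local(n)
    cache = Dict{Any,Any}()   # one cache per closure, owned by the closure
    innerpow(x) = get!(() -> x^n, cache, x)
    return innerpow
end

square = currypow_local(2)
cube = currypow_local(3)

square(3)   # computes 3^2 and stores it in square's cache
cube(3)     # computes 3^3; square's cache is left untouched
```

Because each call to `currypow_local` creates its own dictionary, creating a second closure cannot invalidate the first one's results, and a cache can be garbage-collected together with its closure.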

We can access variables that were closed over via getproperty, so I think it might be possible to solve this by checking all the closure fieldnames that were gensymed by Memoize to see if one matches the appropriate type signature.
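As a rough illustration of reading closed-over variables, the sketch below inspects a closure's fields. `make_adder` and `memo` are hypothetical names, and the layout of closure fields is an implementation detail of Julia, so this shows only the access mechanism and not the signature matching described above.

```julia
function make_adder(n)
    memo = Dict{Any,Any}()
    add(x) = get!(() -> x + n, memo, x)
    return add
end

adder = make_adder(10)
adder(1)

captured = fieldnames(typeof(adder))    # names of the closed-over variables
memo_seen = getproperty(adder, :memo)   # the closure's own cache
```

A macro could store its cache under a gensymed name and find it again this way, assuming the variable is captured directly and not boxed.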

cstjean · 2021-01-24T09:50:08Z

> This PR can be simplified by using the which(f, args) form included in Base. The reason I included the custom _which(signature) function was to support callable objects in the future. If we have a callable type Foo, I wanted to be able to determine if there was a matching method of a Foo without constructing foo. Note that which(Foo, args...) asks for methods of type Foo rather than object Foo.

Thank you for clearing that up, I wondered. Supporting callables is neat, but I'm not sure it warrants a custom which. Let's leave it in the PR for now...

> In this example, each call to currypow will invalidate the cache of the last call to currypow, even though that closure might still be callable.

Yes, that's not ideal. A similar problem arises with this function:

function foo()
    @memoize bar(z) = det(z)
    return bar(randn(1000,1000))
end

This should not leak memory (i.e., it shouldn't hold on to the big matrix after foo() has returned).

With #70 (comment), it wouldn't be an issue, because the memo Dict for @memoize innerpow would just be a local variable. As you say,

> I think that closures might want to use local scope to store their caches

👍

willow-ahrens · 2021-01-24T21:09:38Z

Supporting local scope is trickier than I thought, but I think it is accomplished in willow-ahrens#1. The trickiest part of method overwriting, etc. in local scope is that the method is sometimes callable before the definition is executed and the cache gets initialized. Consider

function foo()
    function bar()
        return 1
    end
    bar()
    qux = 2
    function bar()
        return qux
    end
end

which, when called, gives UndefVarError: qux not defined. I think that it would be fair to disallow calling memoized functions before their definitions are evaluated.

cstjean · 2021-01-24T21:42:03Z

Calling a local function before defining it is bad style as far as I'm concerned, so I wouldn't worry about it.

cstjean · 2021-01-24T22:27:23Z

src/Memoize.jl

+        local $cache = ($tail, $cache_dict)
+
+        $scope = nothing
+
+        if isdefined($__module__, $(QuoteNode(scope)))
+            function $f end
+
+            # If overwriting a method, empty the old cache.
+            # Notice that methods are hashed by their stored signature
+            try
+                local $meth = which($f, $tail)
+                if $meth.sig == $sig && isdefined($meth.module, :__memories__)
+                    empty!(pop!($meth.module.__memories__, $meth.sig, (nothing, []))[2])
+                end
+            catch
+            end
+        end
+
        $(combinedef(def_dict_unmemoized))
-        Base.@__doc__ $(combinedef(def_dict))
+        local $result = Base.@__doc__($(combinedef(def_dict)))
+
+        if isdefined($__module__, $(QuoteNode(scope)))
+            if !@isdefined __memories__
+                __memories__ = Dict()
+            end
+            # Store the cache so that it can be emptied later
+            local $meth = $which($f, $tail)
+            __memories__[$meth.sig] = $cache
+        end


Maybe I'm wrong, but this is where I believe my approach would take just a few extra lines (mostly to build the cache name with Symbol(function_name, arg_names...)) instead of ~20. Complexity inside macros is especially costly, so unless my suggested approach has a fatal flaw, I believe it is the way I will go.
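The comment does not show how the cache name would be built, so the following only illustrates how `Symbol` concatenates its arguments. The helper `cache_name` is a hypothetical name, not part of Memoize.jl.

```julia
# Hypothetical helper: derive one cache variable name from a function name
# and its argument names, using Symbol's argument concatenation.
cache_name(function_name, arg_names) = Symbol(function_name, arg_names...)

name_bar = cache_name(:bar, [:z])       # for `@memoize bar(z) = ...`
name_f = cache_name(:f, [:x, :y])       # for `@memoize f(x, y) = ...`
```

Under this scheme, redefining a method with the same function name and argument names produces the same symbol, so the new definition would reuse the same cache variable.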

willow-ahrens · 2021-01-24

I suppose our difference of opinion lies in the trade-off between correctness and complexity. I suppose I'll just put my code in a separate package. Thanks for the helpful discussion!

cstjean · 2021-01-25

> I suppose our difference of opinion lies in the trade-off between correctness and complexity.

Yes, "worse is better" comes to mind.

I'm sorry that it took so long to reach an impasse. Thank you for all your efforts!

willow-ahrens added 8 commits January 5, 2021 17:31

Add support for method-wise cache invalidation.

f742892

typos

5b96b7f

factor out cache lookup

4ec35f5

style change

28652ad

passes precompile test

4230149

Test precompile limitations

3661204

one more quick test

b5e6dc3

make the description of memories more correct.

24ee4d3

simplified a bit

ddf9b4d

willow-ahrens mentioned this pull request Jan 24, 2021

Remove eval, memoize methodwise, traits, callables, typed caches #70

Closed

willow-ahrens added 6 commits January 24, 2021 11:26

use local scope to store closure records

fe871ae

interesting.

cb11d0c

remove spurious isdefined check

a6b3a1a

scope-speficity to avoid querying world age and calling which in loca…

4e9ab00

…l functions for no good reason.

sneaky way to invalidate caches in closure scope.

04af86b

in the event we rename our function, we must define it first.

6de8680

willow-ahrens added 2 commits January 24, 2021 17:05

Merge pull request #1 from peterahrens/local-scope

7f8b5de

Add scope-specific cache management to JuliaCollections#71

simplify by removing custom which

6057d38

cstjean reviewed Jan 24, 2021

willow-ahrens closed this Jan 24, 2021
