-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lifetime tracking for temporary memory objects #12351
Comments
Another maybe simpler way is to do this at the level of functions: If all allocations inside a function turn out to be unused at the end of the function, we can store the free memory pointer on the stack at the beginning of the function and restore it at the end. This should also be stack-pressure-neutral. |
I have updated the description. I thought I'd just update it for the case of allocation with length known only at runtime but then spent most of the day today trying to think my way out of various corner cases. For example turns out that consuming temporaries in anything that short-circuits (ternary op, bool ops, The update to the base part mostly just removed the limitation for dynamic allocations and fleshed out the rules for marking stuff consuming/non-consuming/temporary/permanent. The rest are the optional extensions.
The biggest problem with it is that a function may have conditions and multiple exits, and conditional allocation complicates the codegen part so I think just storing the free memory pointer won't be enough. I think we'd need a tree of stacks to handle it. |
This issue has been marked as stale due to inactivity for the last 90 days. |
Hi everyone! This issue has been automatically closed due to inactivity. |
Abstract
This issue proposes a mechanism for avoiding permanent memory allocation of memory objects that are immediately consumed and not used afterwards. The mechanism works at compilation time by generating code that writes to designated areas in unallocated memory space in a controlled manner, without the need to actually move the free memory pointer at runtime.
Motivation
There are many cases where the allocated memory is either never used (#12306) or could be freed immediately after use (#12335, #4182, #9046). Attempts to deal with them at Yul level (#5107) were unsuccessful so far because at that level we lose some semantic information about variables. Doing it in the code generator at the Solidity level is actually simpler.
This will make memory use less wasteful in many simple cases, e.g. the common
keccak256(abi.encode(...))
pattern.Specification
To be able to perform this optimization we need to classify certain elements of the AST:
memory
parameters of all callables (functions, modifiers, events, errors, constructors, etc.) and inputs of all types of expressions and statements are classified as either consuming or capturing. A memory object is captured if a reference to it remains stored a variable that survives the call. Otherwise we consider it consumed.memory
return values in all expressions present in the source are classified as either temporary or permanent. The result is temporary if no other reference to the same memory object exists.When a temporary object is passed into a consuming argument (which can be determined at compilation time), it is safe to destroy it immediately after the expression is evaluated.
If a permanent allocation of another memory object is necessary before the deallocation of a temporary, the optimization cannot be performed and the temporary is reclassified as permanent (forced permanent).
consuming and capturing arguments
delete
operator is consuming.&&
,||
,require()
) are capturing.==
.temporary and permanent returns
new
, struct constructors, array literals, ABI-encoding functions, etc.Implementation details
Analysis
At the analysis level we need some extra AST annotations:
lifespan: enum {Permanent, ForcedPermanent, Temporary, Unused}
- on every memory return value of every expression.temporaryHandling: enum {Consuming, Capturing}
- on every memory argument of every expression.consumedTemporaries: list of AST node references
- on every expression. Pointers to nodes representing any temporaries that can be deallocated after the expression is evaluated. Includes the result of the expression itself if it's unused.Code generation
The optimization is performed for each statement in isolation. All allocations within the body are tracked using an internal stack (kept only at compilation time). The stack tracks sizes of temporary memory allocations in the order they are performed. The size can be in one of several variants:
constant
: for allocations of constant size known at compilation time. In this case we store the size directly.memory length prefix
: for dynamic memory allocations where the the size is stored in the first slot of allocated memory (this is the case for dynamic arrays). In this case we store the address of the location of memory slot that stores the length and the type of the memory object (to be able to generate an expression that evaluates to size in bytes at runtime).unbounded
: (optional) for temporary allocations performed by the code generator for its own purposes. These should be ephemeral and never happen outside of generated code but we could track them as a sanity check - such an allocation only ever makes sense on the top of the stack and anything else is a bug.The optimization works as follows:
consumedTemporaries
is visited, sizes of consumed temporaries are popped from the stack. As long as the analysis is performed correctly, it can be safely assumed that all of temporaries to be popped are at the top of the stack.Limitations and possible extensions
Pass-through parameters
In the implementation described above only passing a temporary directly into a consuming argument lets us free it. A temporary will not be freed if it's parenthesized (e.g.
(new uint[](5))
), used in a tuple (e.g.(new uint[](1), new uint[](2), new uint[](3))
), used in an array literal (e.g.[new uint[](1), new uint[](2), new uint[](3)]
) or a struct constructor.To deal with this case:
consumedTemporaries
annotation on the expression Y.consumedTemporaries
can be several levels away.Conditional allocation
The current syntax allows allocating memory in expressions that short-circuit, which results in conditional allocation. For example:
condition ? new[](1) : new[](2)
new uint[](2)[0] == new uint[](2)[0] && new uint[](2)[1] == new uint[](2)[1]
require(condition, new uint[](2)[0] <= new uint[](2)[0] ? "a" : "b")
To be able to mark their arguments as pass-through or consuming, we need to replace the stack in the codegen with a binary tree where each node contains a stack. Then to calculate the address of a temporary we need to sum up not just a single stack but all stacks leading up to the tree root.
User-defined functions
The initial implementation forgoes many optimization opportunities when user-defined functions are involved:
Backwards Compatibility
The change is transparent to the users and can only be observed through different gas usage or inline assembly. As such it's considered fully backwards-compatible.
The text was updated successfully, but these errors were encountered: