Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/go/ssa: generics support #48525

Closed
Tracked by #2649
timothy-king opened this issue Sep 21, 2021 · 58 comments
Closed
Tracked by #2649

x/tools/go/ssa: generics support #48525

timothy-king opened this issue Sep 21, 2021 · 58 comments
Assignees
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) FrozenDueToAge Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@timothy-king
Copy link
Contributor

timothy-king commented Sep 21, 2021

This is a tracking issue for supporting generics within the x/tools/go/ssa package. This depends heavily on proposals for "go/ast" #47781 and "go/types" #47916.

A reasonable goal is to have ssa available at roughly the same time as the release of generics so tool authors can take advantage of it. Earlier so tool authors that build on top of ssa can support generics at the same time as the release would be even better.

Please see update #48525 (comment) for the current status.

@timothy-king timothy-king added Tools This label describes issues relating to any tools in the x/tools repository. Analysis Issues related to static analysis (vet, x/tools/go/analysis) labels Sep 21, 2021
@timothy-king timothy-king added this to the Go1.18 milestone Sep 21, 2021
@timothy-king timothy-king self-assigned this Sep 21, 2021
@timothy-king
Copy link
Contributor Author

There are two relevant contexts we need to consider:

  1. Whole program tools and packages like x/tools/go/callgraph/{...} that will see the original source and all of the possible instantiation sites.
  2. x/tools/go/analysis/passes/buildssa and similar passes that can have access to just the exported type data for imported packages. ATM in ssa, these are primarily used to build stub functions for functions in the imported packages.

@timothy-king
Copy link
Contributor Author

My current idea is to "monomorphize" or "stencil" the instances of the generics functions. This would create a copy of the function specialized for each type instantiation.

Cons:

  • More copies of the function.
  • Further from the source than an uninstantiated version.
    Pros:
  • Simple for the ssa package to implement.
  • Expect it to be relatively simple for ssa users to adjust their existing checkers/analysis.

As for whether there will be too many copies of functions with this approach, it is not yet clear. I am somewhat optimistic it will not. For whole program analysis, this is roughly k=1 context-sensitivity in all of instantiation locations (which strikes me as a reasonable thing to try anyways).

For incremental program analysis (like buildssa), ssa needs to at least provide a stub for generic functions defined in other packages and locally instantiated. Providing an empty stub for each instantiation won't be too much blowup. Whether such empty stubs would be sufficient for ssa users is not yet clear.

@scott-cotton
Copy link

scott-cotton commented Sep 22, 2021

I think that stencilling has the advantage of preserving the signatures of the ast whereas dictionaries or gcshape stencilling would have fewer copies and be more aligned with the current compiler implementation. However, these are questions of implementation of instantiation whilst the API for using ssa with generics is still open.

For incremental program analysis (like buildssa), ssa needs to at least provide a stub for generic functions defined in other packages and locally instantiated. Providing an empty stub for each instantiation won't be too much blowup. Whether such empty stubs would be sufficient for ssa users is not yet clear.

I am also not sure whether empty stubs would be sufficient. The ssa interpreter comes to mind as one which it seems would require code instantiations, the pointer package may be more easily adapted with code instantiations (it is based on a notion of type size which would become variable where it is not now). Maybe "ssa.value instantiations" instead of code instantiations could be an alternative.

Are there thoughts about providing an instantiation API to users of SSA vs instantiating ast/types and using that at ssa construction time? I would be inclined to explore using the ast/types instantiation APIs to provide an instantiation API on ssa.

@scott-cotton
Copy link

For incremental program analysis (like buildssa), ssa needs to at least provide a stub for generic functions defined in other packages and locally instantiated. Providing an empty stub for each instantiation won't be too much blowup. Whether such empty stubs would be sufficient for ssa users is not yet clear.

I am also not sure whether empty stubs would be sufficient. The ssa interpreter comes to mind as one which it seems would require code instantiations, the pointer package may be more easily adapted with code instantiations (it is based on a notion of type size which would become variable where it is not now). Maybe "ssa.value instantiations" instead of code instantiations could be an alternative.

Ah wait, the interpreter is whole-program. Not so sure about the pointer package, it is in one sense whole-program whereas in another it can be applied to libraries.

@timothy-king
Copy link
Contributor Author

Not so sure about the pointer package, it is in one sense whole-program whereas in another it can be applied to libraries.

x/tools/go/pointer falls into the whole-program paradigm.

Zooming out a bit. If we have an instantiation "p.F[local]" within a package "q", the only stubs solution for buildssa does means we would never see an instance of "p.F[local]". This would create some holes for analysis: like incremental pointer analysis, "is there a reachable instruction that modifies this field", etc.

@timothy-king
Copy link
Contributor Author

the pointer package may be more easily adapted with code instantiations (it is based on a notion of type size which would become variable where it is not now

Non-instantiated generic functions [if we choose to have them] do not need to be considered reachable/real code. So tools could dodge this by skipping them. (Not in love with this, but I think it would be okay?)

@timothy-king
Copy link
Contributor Author

Are there thoughts about providing an instantiation API to users of SSA vs instantiating ast/types and using that at ssa construction time?

@scott-cotton I don't think I understand the question well enough to answer. Can you give more details?

@scott-cotton
Copy link

Are there thoughts about providing an instantiation API to users of SSA vs instantiating ast/types and using that at ssa construction time?

@scott-cotton I don't think I understand the question well enough to answer. Can you give more details?

The intuition I have is to explore making ssa instantiable for consumers of ssa. Something perhaps (very roughly) like ssa.Instantiate(v ssa.Value, ty1, ty2,...) which would do the stencilling (or whatever implementation). Not sure whether you are considering this, but if it is a desirable thing to provide then implementing that may also serve buildssa, whereas if the instantiations are hidden in buildssa, built by instantiating on the ast and then produce the ssa, then providing this seems like it would be harder. This idea is just an intuition I have, it is not really fleshed out...

@scott-cotton
Copy link

Not so sure about the pointer package, it is in one sense whole-program whereas in another it can be applied to libraries.

x/tools/go/pointer falls into the whole-program paradigm.

Zooming out a bit. If we have an instantiation "p.F[local]" within a package "q", the only stubs solution for buildssa does means we would never see an instance of "p.F[local]". This would create some holes for analysis: like incremental pointer analysis, "is there a reachable instruction that modifies this field", etc.

Regarding whole program vs incremental, I think it will help to clarify some things, so we are talking about the same properties:

  1. Is whole program analysing behaviour only reachable from main entry points or does it also include having the whole source but no entry-points? For example, a package with no mains and all its dependants -- is it whole program or not?
  2. Does incremental mean bottom-up (depended upon first, like printf checker detecting and propagating wrappers) or where units of analysis are piece-wise independent (like prove in the compiler on a loop)?

On point 1), godoc -analysis=pointer works on library code and uses x/tools/go/pointer
while in principal the basis of the analysis is only valid for whole program in terms of reachable code from a given set of entry points. It is run anyway and provides information.

Yes I agree the buildssa stubs approach would create holes in what you are calling incremental analysis. I also agree that stubs may be preferable for other tools using ssa from buildssa on non-main programs. That combination of things may also make it interesting to consider an instantiation api as noted in the previous comment: with such a thing callers could decide what to instantiate and if and when.

@timothy-king
Copy link
Contributor Author

Something perhaps (very roughly) like ssa.Instantiate(v ssa.Value, ty1, ty2,...) which would do the stencilling (or whatever implementation).

I think it might be possible to provide this as a feature for *ssa.Function, and *ssa.Type. Arbitrary Values will result in unconnected instructions. I do think it would be helpful to have a clear use case before expanding the API with this though. SSA's API tends not to expose much for users to modify. A convincing case would be something that (1) comes up in practice and (2) is very hard to otherwise support. I think we can wait on this as a feature.

Regarding whole program vs incremental, I think it will help to clarify some things, so we are talking about the same properties:

The distinction I am really interested in is:

  1. All packages.Packages used for building are transitively loaded and built. This is what I was somewhat inaccurately calling wholeprogram.
  2. The current packages.Package is available for 1 package and gcexportdata is available for direct imports (and supports go/types.Importer). This is what I'm referring to as incremental.

godoc -analysis=pointer would fall into the first bucket. buildssa falls into the second bucket.

That combination of things may also make it interesting to consider an instantiation api as noted in the previous comment: with such a thing callers could decide what to instantiate and if and when.

For buildssa, the source[/ast] needed for generation from a direct[/transitive] import is not immediately available. What would we stencil from? It needs to be serialized somewhere in the on disk cache. This is solvable in a variety of ways, but it not currently supported.

A practical implication of this is that we currently expect that in a different package from a generic function, one will only see the stub for an instantiation "math.Max[int]". If "math" did not do this specific instantiation, "math.Max[int]", there will be no instantiated body for "math.Max[int]" at any point available to an analysis.Analyzer. These Analyzers would be expected to summarize "Max" on the Pass looking at "math" and apply this summary/Fact on the instantiation. These are going to be more challenging to write as one would need to deal with type parameters.

@adonovan
Copy link
Member

adonovan commented Jan 12, 2022

Hi, author of go/ssa and go/pointer here. Thanks for tackling this task. To reduce the scope of work, you might want to consider abandoning go/ssa/interp. It was intended as a way to flush out the bugs in the SSA construction algorithm by ensuring fidelity to the dynamic semantics enforced by tests in the standard library, not as a real interpreter. It has already served its purpose. It's kind of a fun pedagogical tool to show the small-step semantics of Go close up, but it's not really a useful library.

Similarly, go/pointer has proven fragile because it must make assumptions about the internal layout of various run-time datastructures that evolve not only from one Go version to the next, but from one day to the next. I no longer believe it is remotely feasible for designers of static analysis tools to keep up with such a stream of changes, nor reasonable to expect users of the tool (or maintainers of libraries) to annotate the aliasing effects of their code. So I don't think pointer analysis algorithms like this are really viable except when analyzing object code for a whole program. Again, feel free to cut your losses.

Of course we shouldn't break compatibility in published libraries, but the notion of compatibility is fuzzy for libraries whose interface must incur breaking changes when the language spec changes.

@timothy-king
Copy link
Contributor Author

To reduce the scope of work, you might want to consider abandoning go/ssa/interp.

TBH go/ssa/interp has proven to be massively useful for testing these changes. After this effort, there might be a 4-5 year stretch of it not being too helpful after this. But for the moment it is too valuable for development to not support it.

As for go/pointer, if we monomorphize generics, this package should "just work". Maybe some minor hiccups? If we 'interpret type parameters' [somehow], that going to require many changes to go/pointer. Similarly if we suggest an instantiation type and monomorphize w.r.t. the suggested type, users will need to understand how to connect the pieces. This would have a big impact on go/pointer. The problems caused by not having an interpretation for uninstantiated type params goes beyond go/pointer though. It'll impact anyone trying to use ssa on generics.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/385774 mentions this issue: go/loader: Initialize (types/Info).Instances field

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/385775 mentions this issue: go/ssa/ssautil: Initialize Instances field.

gopherbot pushed a commit to golang/tools that referenced this issue Feb 15, 2022
Initialize the Instances field of PackageInfo.Info during importing.

Needed for go/ssa and similar users.

Updates golang/go#48525

Change-Id: Ibacf925e677ec6e90068b90a4f381d96c22338cc
Reviewed-on: https://go-review.googlesource.com/c/tools/+/385774
Run-TryBot: Tim King <[email protected]>
gopls-CI: kokoro <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Robert Findley <[email protected]>
Reviewed-by: Michael Matloob <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Feb 15, 2022
Initializes the Instances field during BuildPackage.

Updates golang/go#48525

Change-Id: I4d60d644443733930528e2109db75589511de6f4
Reviewed-on: https://go-review.googlesource.com/c/tools/+/385775
Run-TryBot: Tim King <[email protected]>
gopls-CI: kokoro <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Robert Findley <[email protected]>
Trust: Tim King <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/386315 mentions this issue: go/ssa: Put type canonicalization on own mutex.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/386316 mentions this issue: go/ssa: Move BasicBlock operations into block.go

gopherbot pushed a commit to golang/tools that referenced this issue May 13, 2022
Simplifies handling *types.Selections by always using a *ssa.selection
internally. Updates the selection during monomorphization.

Updates golang/go#48525

Change-Id: If9cf7a623d3fed060dda41a5b65c46fcfe3d431c
Reviewed-on: https://go-review.googlesource.com/c/tools/+/405557
Reviewed-by: Alan Donovan <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Robert Findley <[email protected]>
gopls-CI: kokoro <[email protected]>
Run-TryBot: Tim King <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue May 16, 2022
Updates golang/go#48525

Change-Id: I7e25ab136dd69ebd50b12894bc893986fc59999b
Reviewed-on: https://go-review.googlesource.com/c/tools/+/402994
Run-TryBot: Zvonimir Pavlinovic <[email protected]>
gopls-CI: kokoro <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Tim King <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
@timothy-king
Copy link
Contributor Author

Status Update – 2022-05-16

x/tools/go/ssa is expected to not crash on code containing generics. If you encounter a panic within the ssa package or any package in x/tools that uses ssa, please file a new issue on the issue tracker.

The SSA emitted for non-generic functions will remain backwards compatible with the previously emitted SSA. A call from a non-generic function to a generic function will contain a Call to an *ssa.Function that is an instantiation of the generic function with an appropriate signature for the given type arguments. Instantiations do not appear within the x/tools/go/analysis/passes/buildssa.SSA.SrcFuncs field.

For the time being, SSA represents source functions and methods that have type parameters as functions with empty bodies. (This behavior is expected to change in a future update. See remaining work)

The ssa.BuilderMode flag ssa.InstantiateGenerics controls how ssa builds generic function instantiations.

  • ssa.InstantiateGenerics is off (default): Instantiations have empty bodies. (Expected to change.)
  • ssa.InstantiateGenerics is on: calls to generic functions are instantiated by concrete type arguments. These instances will build a specialized body from the original syntax and the provided type arguments. This instantiation happens exhaustively. This results in a complete function body instantiation for every instantiation that is required in the ssa.Program. This may be expensive. This mode is likely to be appropriate for whole program analysis or analysis that requires knowing all runtime types to be correct. For “modular” analysis with x/tools/go/analysis, this will only be able to instantiate functions whose bodies are available (i.e. instantiations of generic functions defined in separate package will have empty bodies during compositional analysis).

Tool developers that use ssa will need to decide which mode works best. Our experience with updating ssa users within x/tools (#52504) has been that usage of SSA with generics has been backwards compatible once the appropriate ssa.BuilderMode is selected.

Remaining work – 2022-05-16

This issue is not yet resolved. Here is the expected ongoing work for this issue.

  1. We are extending compilation to create a body for generic functions from the AST regardless of the BuilderMode. There will be at least one generic function per source function. Details for how this will be achieved are still being determined. This mode is intended to support analysis from x/tools/go/analysis.
  2. Create bodies for instantiations when ssa.InstantiateGenerics is off.
  3. We will create a proposal for the public interface changes for SSA. This is expected to include:
  • New Instructions/Values if any [largely dependent on previous steps]
  • Non-backwards compatible updates to existing Instructions
  • Exposing new fields for navigating generic functions [e.g. Function._Origin -> Function.Origin ]
  • Exposing new utility functions for navigating from a source function to instantiations
  1. Determine the long term BuilderMode for x/tools/go/analysis/passes/buildssa.Analyzer.

Thank you everyone for your patience while we roll out updates to this library.

@ianlancetaylor
Copy link
Member

This is currently in the 1.19 milestone. Anything further for 1.19? Should the milestone move to 1.20? Thanks.

@timothy-king
Copy link
Contributor Author

This is still a priority, but as it is not tied directly to the release process I think it is safe to move this to 1.20. (x/tools/go/ssa is not shipped with the Go release.)

@timothy-king
Copy link
Contributor Author

Status Update - 2022-07-20

I have a prototype that creates a body for each generic function from the AST regardless of the BuilderMode. This covers points 1 and 2 from #48525 (comment). The prototype is able to build the compiler's tests in test/typeparam. The prototype still requires review and more extensive testing, but this is a significant milestone. I expect to make the changes publicly available for review and submit a proposal for the API changes in August.

Thank you for your continued patience.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/425496 mentions this issue: go/ssa: build generic function bodies

@timothy-king
Copy link
Contributor Author

Status Update - 2022-08-31

We are pleased to share a candidate implementation of generic types and functions in go/ssa. We are currently soliciting feedback, though we expect that it is close to its final state and hope to submit it in September, at which point we will close this issue.

API changes:

  • For generic functions and methods produced from syntax trees, we build a generic ssa.Function. Type parameters, and types containing them, may now appear as the operands and results of ssa.Instructions. For example, a hypothetical generic min[T] function would contain an instruction corresponding to the binary operation T < T.
  • Instances of generic functions may be created in one of two forms, determined by the InstantiateGenerics builder mode flag. (The flag has the same behavior as before this update.)
    • When InstantiateGenerics is disabled (the default), each instantiation is materialized as a thin wrapper that delegates to the generic function. For example, given a hypothetical generic slices.Reverse[T] function, the instance Reverse[int] would be a wrapper that calls Reverse[T], coercing its argument from []int to []T and its result from []T to []int.
    • When InstantiateGenerics is enabled, each fully instantiated instance is expanded and specialized for its particular types. In this mode, Reverse[int] would contain the complete logic of slice reversal specialized to integers. Partial instantiations are still materialized as wrappers. For example, a generic method TreeMap[K,V].Contains may be partially instantiated to a wrapper TreeMap[string, V].Contains while TreeMap[string, int].Contains is fully expanded and specialized.
  • The ChangeType instruction can now represent a coercion to or from a parameterized type to an instance of the type.
  • Const values can now represent zero values of any type, including structs, arrays, and type parameters.
  • go/analysis/passes/buildssa builds with the default build mode. (ssa.InstantiateGenerics is off.)
  • ssa.Function has three new methods to help navigate generic functions and instantiations:
    • TypeParams, which returns the function’s list of type parameters;
    • TypeArgs, which returns the instantiations of the function’s type parameters;
    • Origin, which returns the generic function, given one of its instantiations.

go/pointer and go/ssa/interp require ssa.InstantiateGenerics to be enabled, and there are no plans to support these with ssa.InstantiateGenerics off. We are still assessing whether to update the packages in go/callgraph InstantiateGenerics off.

Please let us know if you have any feedback on the interface changes. Feedback on the implementation may be left on Gerrit.

If you are interested in testing this, please patch in https://go-review.git.corp.google.com/c/tools/+/425496. If you encounter bugs, please update this issue.

@findleyr
Copy link
Member

findleyr commented Sep 9, 2022

Congratulations all on the progress, I know this has been an enormous undertaking!

I expect that most interested parties have probably already chimed in, but given that the API has stabilized perhaps now is a good time to promote this issue to a proposal. The proposal committee may choose to simply accept the existing discussion as complete, but we do have a policy that all API changes should have an associated proposal. Having this in the proposal meeting notes will also elevate visibility.

@timothy-king
Copy link
Contributor Author

Status update I posted on #54984 (comment)

There are no known bugs with go/ssa over Google's internal Go code base with BuilderMode(0).

@gopherbot gopherbot moved this to Done in Release Blockers Nov 18, 2022
@golang golang locked and limited conversation to collaborators Nov 18, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) FrozenDueToAge Tools This label describes issues relating to any tools in the x/tools repository.
Projects
Status: Done
Development

No branches or pull requests

13 participants