Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: slow compilation of a package with a very large API #53092

Closed
baywet opened this issue May 26, 2022 · 21 comments
Closed

cmd/compile: slow compilation of a package with a very large API #53092

baywet opened this issue May 26, 2022 · 21 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. ToolSpeed
Milestone

Comments

@baywet
Copy link

baywet commented May 26, 2022

What version of Go are you using (go version)?

go1.18.2 windows/amd64

the OS doesn't seem to be making a big difference, and I was experiencing the issue since at least go 17

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\vibiret\AppData\Local\go-build
set GOENV=C:\Users\vibiret\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\vibiret\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\vibiret\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=C:\Program Files\Go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.18.2
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\sources\gitlocal\gorepro\go.mod
set GOWORK=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\vibiret\AppData\Local\Temp\go-build1532428258=/tmp/go-build -gno-record-gcc-switches

What did you do?

go mod init main
go get github.com/microsoftgraph/[email protected] #takes about 20 seconds on my machine, expected, this is the large package we're maintaining
go get github.com/Azure/azure-sdk-for-go/sdk/[email protected] # takes about 5 seconds, expected
go get github.com/microsoft/[email protected] # takes about 2 seconds, expected

notepad main.go #or code, vim,... whatever you have on your OS

# paste the content from the appendix

go build # takes about 1m30s, long but expected

Now replace the following lines

-result, err := client.Me().Messages().Get()
+result, err := client.Me().MailFoldersById("Inbox").Messages().Get()

Then run go build again, which takes about 30 seconds, not expected, should take a couple of seconds thanks to caching.

What did you expect to see?

Subsequent builds should be in the sub seconds time as the packages should be cached.

What did you see instead?

Takes about the original build time.

Additional context

We're maintaining a large package that's generated from an API description. The API itself is fairly large. The initial ask from our SDK users was about the build time increasing significantly when adding our package and caching seeming like it's not fully working from one go build to another.

With the release of 1.18 and generics, we're already identified opportunities for improvements to generate less code, but haven't acted on it yet. @breml @josharian @dominikh were really helpful with framing the issue and suggested we posted here before making those improvements.

Additionally, the go doc for our package is not working either.

Appendix

main.go content

package main

import (
	"context"
	"fmt"

	azidentity "github.com/Azure/azure-sdk-for-go/sdk/azidentity"
	a "github.com/microsoft/kiota-authentication-azure-go"
	r "github.com/microsoft/kiota-http-go"
	msgraphsdk "github.com/microsoftgraph/msgraph-sdk-go"
)

func main() {
	cred, err := azidentity.NewDeviceCodeCredential(&azidentity.DeviceCodeCredentialOptions{
		TenantID: "placeholder",
		ClientID: "placeholder",
		UserPrompt: func(ctx context.Context, message azidentity.DeviceCodeMessage) error {
			fmt.Println(message.Message)
			return nil
		},
	})
	if err != nil {
		fmt.Printf("Error creating credentials: %v\n", err)
	}
	auth, err := a.NewAzureIdentityAuthenticationProviderWithScopes(cred, []string{"Mail.Read", "Mail.Send"})
	if err != nil {
		fmt.Printf("Error authentication provider: %v\n", err)
		return
	}
	adapter, err := r.NewNetHttpRequestAdapter(auth)
	if err != nil {
		fmt.Printf("Error creating adapter: %v\n", err)
		return
	}
	client := msgraphsdk.NewGraphServiceClient(adapter)
	result, err := client.Me().Messages().Get()
	if err != nil {
		fmt.Printf("Error executing request: %v\n", err)
		return
	}
	if result != nil {
		fmt.Printf("Got a result")
		return
	}
}
@mvdan
Copy link
Member

mvdan commented May 26, 2022

should take a couple of seconds thanks to caching.

To add some context for people coming across this issue: Go's build cache works at the package level, so modifying a package in any way will generally mean rebuilding the entire package. See https://pkg.go.dev/cmd/go#hdr-Build_and_test_caching.

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 26, 2022
@toothrot toothrot added this to the Backlog milestone May 26, 2022
@baywet
Copy link
Author

baywet commented Jul 12, 2022

Adding @randall77 @griesemer @matloob @rsc @bcmills @golang/pkgsite for visibility here. Sorry about the spam, but we're trying to get traction.

@mvdan
Copy link
Member

mvdan commented Jul 12, 2022

Please don't do that. The majority of the people you pinged don't even work on the compiler regularly.

@baywet
Copy link
Author

baywet commented Jul 12, 2022

Apologies. This is not something I usually do. What's the right process?

@bcmills
Copy link
Contributor

bcmills commented Jul 12, 2022

A good starting point might be to make the issue more actionable by tracing the command to figure out where the bottleneck is.
@matloob added an experimental -debug-trace flag in CL 237683 — could you try running the unexpectedly slow command with -debug-trace=<some file>, and then view it with go tool trace to see where it is spending its time?

That would at least help identify the appropriate maintainers to route this issue to, or perhaps even an opportunity for you to contribute a fix.

@baywet
Copy link
Author

baywet commented Jul 12, 2022

Thanks for the guidance
Here is what I did:

  1. go build -debug-trace="initial.out"
  2. edit the code with a trivial change
  3. go build -debug-trace="final.out"
  4. go tool trace final.out

Which results in

2022/07/12 10:30:59 Parsing trace...
failed to parse trace: not a trace file

Attaching the files.
traces.zip

@bcmills
Copy link
Contributor

bcmills commented Jul 12, 2022

Looks like the trace can be opened by the viewer hosted at https://ui.perfetto.dev/.

@baywet
Copy link
Author

baywet commented Jul 12, 2022

Thanks.
From what I can see, there are three aspects that take a while:

  • load.PackagesAndErrors
  • build dependencies
  • link

It looks to me that we could save approx half the time if the former was running on multiple thread instead of single threaded. I don't know much about the internals though, is there a specific reason why that's running on a single thread at the moment?

For build dependencies, most of the steps "take no time" which I'm assuming means they are being retrieved from cache. However some do take a little bit of time although they haven't changed, could it indicate a cache miss?

For the later, could/should the link result be cached to some extent?

image

@bcmills
Copy link
Contributor

bcmills commented Jul 12, 2022

The threading behavior in the trace doesn't match the actual threading behavior of the program — it's kind of an impedance mismatch between Go's threading model and the model expected by the trace viewer. However, the indexing changes coming up in Go 1.19 are intended to speed up that loading phase, especially on platforms with unusually slow filesystems (such as Windows 😅).

For build dependencies, most of the steps "take no time" which I'm assuming means they are being retrieved from cache. However some do take a little bit of time although they haven't changed, could it indicate a cache miss?

Possibly, yes. And we do expect to see at least one cache miss for package main. 🤔

For the [latter], could/should the link result be cached to some extent?

Unfortunately, linking a Go program is not incremental: if any of the inputs changes, the entire link step must be redone. It might be worth profiling and/or tracing the linker to see what can be done to reduce the latency of that step, but I know a lot of work has already gone into making the linker faster.

For now, the best way to speed up linking is to present the linker with less code: that is, to break up APIs into smaller packages so that the linker doesn't have to process (and discard) a lot of irrelevant code.

@baywet
Copy link
Author

baywet commented Jul 12, 2022

Go 1.19 speeding up loading

Great, looking forward to that version! Are you looking for feedback on those aspects? (i.e. would you like me to run the trace with go 1.19 and see if we get different results?)

cache miss

Any action I should take to investigate that further?

no linking cache

Understood, thanks for the context. Is that work going to be part of Go1.19 as well?

reduce the code

We already identified a couple of actions we'll take or the API users can take to reduce the packages sizes. I was encouraged by the community using our SDK to reach out here before they are implemented because even if our code-base is large, some of the behaviour that was observed seemed odd.

@baywet
Copy link
Author

baywet commented Jul 12, 2022

Another aspect we didn't get into so far is the documentation for the models sub package is failing to display https://pkg.go.dev/github.com/microsoftgraph/[email protected]/models

@bcmills
Copy link
Contributor

bcmills commented Jul 12, 2022

That is a truly enormous number of source files. I would suggest splitting up the package into smaller, self-contained units rather than trying to get the various bits of toolchain to deal gracefully with such a monolith.

@baywet
Copy link
Author

baywet commented Jul 13, 2022

We unfortunately don't have that option in our case, this package is representing and API that is a graph (no relation with GraphQL). All the types and endpoints are intricated, we've tried to make educated guesses to ship sub-packages on arbitrary divisions in the past but it's an ever-going debate of "you didn't split the cake the way I wanted it".
Outside of the full service library, people also have the choice to slice the API surface however they like it and get a client for the exact subset they want. But all of that is outside of this discussion.

We could potentially reduce the number of types by half IF we can get rid of the interfaces. The main reason we brought interfaces was because it seemed like the only way to support "downcasting" in Go at the moment.
We have lots of cases where the API defines models that inherit from one another and declares an endpoint that returns the base model and could return the derived ones.
Eg. If we have a Vehicle struct, and a Car and Truck struct that "inherit" from it with a few extra fields, and we have a GetVehicles method which returns a slice of vehicles. The method might return Cars and Trucks in the slice. Now, if people want to know whether they got a car or a truck and access the additional fields:
The following works

myCar := myVehicule.(Carable) // where carable is an interface that reflects all the accessors present on the struct

But this doesn't work

myCar := myVehicule.(*Car)

And the same problem arises for upcasts.

@mvdan
Copy link
Member

mvdan commented Jul 13, 2022

Just to give a bit of perspective: Go tooling generally assumes that Go packages are reasonably sized. Tools like go doc or pkg.go.dev will never be useful on a huge package even if the toolchain became magically fast, for example. Similarly, static analysis tools like staticcheck cache their work at the package level. It may be possible to chip away at performance issues in the toolchain, but ultimately, end users will almost always have a bad user experience with huge Go packages.

@baywet
Copy link
Author

baywet commented Jul 13, 2022

what counts against a package size? types, functions? sub packages?
What I'm trying to understand with this question is which part might be problematic in our current setup:

  • models sub package that contains thousands of types
  • the hierarchical structure made of packages that contains a couple of types at each node level

@bcmills
Copy link
Contributor

bcmills commented Jul 13, 2022

what counts against a package size? types, functions? sub packages?

To a first approximation: the number and size of the declarations in the package. (Types, functions, even constants to some extent.)

Subpackages within the same module are not entirely free, but much lower-cost: individual packages are more amenable to caching (the “compilation unit” for Go is a package, not a source file), and unused packages don't end up causing a lot of wasted work for the compiler and linker.

@baywet
Copy link
Author

baywet commented Jul 13, 2022

Thanks, so the models package is probably the most painful one at this point. Do you have any advice on my downcast question? If we could solve for this, we could divide the number of types by 2.
Would moving those interfaces in a separate package help? it might degrade the experience a little though.

@bcmills
Copy link
Contributor

bcmills commented Jul 13, 2022

Do you have any advice on my downcast question?

Interfaces are usually the most appropriate way to do what you're describing in Go.
But that's also getting to be a bit of a tangent; perhaps we could continue that part of the discussion on microsoftgraph/msgraph-sdk-go#129?

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
@baywet
Copy link
Author

baywet commented Jul 13, 2022

Sure, we can continue the discussion over there, to recap:

  • This package is larger than most go packages out there.
  • The main offender is the models sub package, which even causes issues with go doc.
  • 50% of the models packages are interfaces added to provide downcast support, this is built the preferred way by the Go community and there's no opportunity to get rid of them.
  • caching might be behaving oddly (more on that below)
  • go 1.19 brings more than welcomed performance improvements, but we'll still be on the same factor of scale.

On caching:

I went and revisited the chart, the models package takes about 400ms to load. Given the number of "symbols", is that expected, or should it be considered a cache miss?
Is there a way to get clear data about cache hit and caches misses for packages build?
image

@mknyszek mknyszek moved this to Triage Backlog in Go Compiler / Runtime Jul 15, 2022
@baywet
Copy link
Author

baywet commented Oct 4, 2022

Hi everyone,
Thanks for the help, we've made progress on our performance journey.

What's been successful:

  • removing a bunch of overload functions from the generated code, which effectively reduces the number of symbols
  • reducing the number of parameters on some methods in favor of configuration objects
  • reducing the number of types (structs/interfaces) to maximize reusability (metadata generation pre-process
  • reducing implementation code where possible without needing generics (redundant nil checks etc...)

However, when rolled out generics to reduce further the amount of code (mostly type asserting values or collections), we've noticed significant memory increase. This is OKish on local dev machines as they usually have lots of memory, but it becomes a concern for CI since most agents are memory constrained. That results in the build process being killed by agent.

I'll be updating the repro soon to reflect the changes. In the meantime, what traces would you like me to collect to investigate this memory impact of generics?

@baywet
Copy link
Author

baywet commented Dec 14, 2022

Hi everyone,
After lots of investigations here is a recap of our situation before I close the issue.
We were able to reduce the initial compilation time (no cache) from 12 minutes to 3 minutes (on a standard GitHub agent).
We also reduced cached builds from 3/4 minutes to a couple of seconds.

We rolled back the use of generics, the compiler uses much more memory with generics which leads it to trash the CPU on memory constrained agents (anything under 16GB of ram).

We also flattened most of the packages for the fluent API surface, having a few large packages works much better than having lots of very small packages for the caching system. (e.g. everything that was under /users/id/something is now right away under /users). The only downside of that last change is that pkg.go.dev is now failing for the fluent API surface as well. This is something can live with as we have a public docs platform that not only contains the snippets but also the documentation for the APIs.

The only trick left up our sleave would be to map repeating nodes of the fluent API surface instead of duplicating them, we're not yet committed to that change (this is a decision larger than just the Go client that'd impact the accuracy of our clients with regards to the REST API). We anticipate that last change could yield an additional 15% build time improvement.

I hope those additional details help people running into similar issues and the broader Go community to improve support for large packages in the future. Thanks everyone here and on the issues in our repos for the help getting to the bottom of this! ❤️

Closing.

@baywet baywet closed this as completed Dec 14, 2022
Repository owner moved this from Triage Backlog to Done in Go Compiler / Runtime Dec 14, 2022
@golang golang locked and limited conversation to collaborators Dec 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. ToolSpeed
Projects
None yet
Development

No branches or pull requests

5 participants