Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go: improve modeling of go_package via target generation #13049

Closed
Eric-Arellano opened this issue Sep 30, 2021 · 3 comments · Fixed by #13139
Closed

Go: improve modeling of go_package via target generation #13049

Eric-Arellano opened this issue Sep 30, 2021 · 3 comments · Fixed by #13139
Assignees
Labels
backend: Go Go backend-related issues

Comments

@Eric-Arellano
Copy link
Contributor

We should probably be generating go_package based on the go_module target. This will reduce boilerplate. It also facilitates moving some information onto the go_package target, such as requiring that import_path be set and adding a required field that maps back to the owning go_module target.

@Eric-Arellano Eric-Arellano added the backend: Go Go backend-related issues label Sep 30, 2021
@Eric-Arellano Eric-Arellano self-assigned this Oct 5, 2021
@Eric-Arellano
Copy link
Contributor Author

We need to figure out the generated_name for the Address. I think it should be the subpath relative to the go.mod. If you have project/go.mod and project/util/hello.go, then you'd have project#util or project:mod#util.

That works great, except what do we do if you have project/hello.go? project# doesn't work. Note that in Go, the import path of project/hello.go (with project/go.mod present) would be the same as the module directive in the go.mod.

cc @stuhood @tdyas

@Eric-Arellano
Copy link
Contributor Author

except what do we do if you have project/hello.go? project# doesn't work. Note that in Go, the import path of project/hello.go (with project/go.mod present) would be the same as the module directive in the go.mod.

We could do project#./, meaning the project dir...Weird, but I think better than something like project#pkg which would violate the scheme of project#some_dir. And probably better than project#/.

@kaos
Copy link
Member

kaos commented Oct 5, 2021

Feels like there's a few moving parts and a bunch of assumptions baked in between the lines here. So I'll try and present them here first, to clarify whether I've understood it correctly or not.

Given:

project/
  go.mod  -- with a `module` directive, say "example.net/project"
  main.go  -- with "package main", import path "example.net/project"
  util/
    hello.go  -- with "package util", import path "example.net/project/util"

Also, what are the addresses for the go_modules here?
Is it: project/main.go:main, project/util/hello.go:hello ?

So, there's not a 1-1 mapping between go_module and go_package, right? As a go_package may contain several .go files.. Praxis is to have one go_package per directory, more or less, I believe.. but it is the package directive in the .go file together with its relative path in the project tree.. or something (I'm no Go expert, so all this is a bit vague to me)
If so, then by inspecting the set of available go_modules should yield a set of go_packages..

The generator, could that be seen as the go_mod of the go.mod file? I.e. from project:lib if the BUILD file had a go_mod(name="lib") target.

Now, that was quite a few background assumptions..

To me then, the generated addresses for the go_package targets from the go_mod one could look like this:
project:lib#project,
project:lib#project/util,
...

Ah, now looking at it, I guess my suggestion is to not use . for the current directory, but the directory name explicitly, to make it extra clear and avoid the trailing slash (and also, I think it should come from the module directive of the go.mod file rather than the directory... which should be the same, but still..)

Eric-Arellano added a commit that referenced this issue Oct 6, 2021
Closes #13049. This greatly reduces boilerplate and also allows us to make some fields required like `import_path` and `subpath`, so that we don't have to calculate those in consuming rules like `build_go_pkg.py`.

## The address format

The generated address looks like `project#./dir`. @tdyas offered that this is intuitive for Go developers because they have to say `go build ./dir` already with the leading `./`. 

This solves how to represent when the `_go_internal_package` is in the same dir as the `go_mod`: `project#./`.

This also makes very clear the difference from external packages like `project#rsc.io/quote` vs. internal packages like `project#./dir`.

## Improves dependency inference

Now that the `import_path` field is required for both `_go_internal_package` and `_go_external_package`, we can create a global mapping of `import_path -> pkg_target`. This is necessary for #13114.

This also improves performance. We don't need to call `ResolvedGoPackage` on all the candidate targets a package might depend on to calculate their import paths. We do still need it when resolving the deps of a particular `_go_internal_package`, but we can be lazier when we call that codepath in not evaluating all candidate targets.

### `dependencies` benchmark

As expected, there is no difference because we are finding the dependencies of everything, so we still have to call `ResolvedGoPackage`. The perf gains are only in things sometimes being less eager, which isn't the case here.

Before:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache dependencies ::'
  Time (mean ± σ):     26.501 s ±  1.537 s    [User: 29.554 s, System: 24.115 s]
  Range (min … max):   24.928 s … 28.763 s    5 runs
```

After:
```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache dependencies ::'
  Time (mean ± σ):     26.359 s ±  0.526 s    [User: 29.600 s, System: 23.769 s]
  Range (min … max):   25.625 s … 26.993 s    5 runs
```

### `package` benchmark

Before:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache package ::'
  Time (mean ± σ):     33.777 s ±  0.248 s    [User: 39.221 s, System: 39.389 s]
  Range (min … max):   33.517 s … 34.062 s    5 runs
```

After:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache package ::'
Benchmark #1: ./pants_from_sources --no-pantsd --no-process-execution-local-cache package ::
  Time (mean ± σ):     31.049 s ±  0.702 s    [User: 40.606 s, System: 40.537 s]
  Range (min … max):   30.512 s … 32.273 s    5 runs
```

## TODO: fix `go_binary` inference of `main` field

#13117 added inference of the `main` field for `go_binary`, that it defaults to the `go_package` defined in that directory.

But target generation no longer generates targets actually in each directory. All generated targets are "located" in the BUILD file of the `go_mod`, i.e. their `spec_path` is set to that. So it no longer looks to `AddressSpecs` like there are any targets in each subdirectory, and there are >1 `_go_internal_package` targets in the `go_mod` dir.

Instead, we should use the `subpath` field to determine what directory the targets correspond to.

[ci skip-rust]
[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: Go Go backend-related issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants