Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permissions for Registrator #329

Open
aviks opened this issue Apr 14, 2021 · 34 comments
Open

Permissions for Registrator #329

aviks opened this issue Apr 14, 2021 · 34 comments

Comments

@aviks
Copy link

aviks commented Apr 14, 2021

So we've had users complain that the web registrator hosted on juliahub.com asks for write permissions. I wanted to discuss the options, since this is security stance that affects the registry itself. We need consensus from the registry maintainers about what the correct stance should be.

The underlying requirement is that before the registrator user opens a PR on the registry repo, it needs to be sure that the user requesting the registration has write permissions to the repository.

For the web registrator we need to query github apis to get this information. However, the list of "collaborators" on a package is not available to a user unless that user has write permissions to the repository. This is the reason we ask for write permissions for the Oauth token we get from Github. Also, afaik, we cannot restrict permissions per repository on the Oauth token -- per repository access can be given using a deploy token, but that is not a one-click flow.

As a result of the strength of said feedback, we have this morning termporarily changed juliahub web registrator to use only read permissions. I fear however, this may have been a knee jerk reaction by us. To enable the registrator functionality with read permissions, we are now using the list of "contributors" for a repo. This information is freely accessible. However, this list also includes any users who have had a PR merged to the repo, even if they do not have write access. Additionally, this list precludes any organisation members or admins, who may have write permissions, but have not actually merged any code to that repo. Both of which is problematic. The latter can be solved by asking such users to make their org membership public, but the former (user with PR merged) is a problem.

The commentbot needs the same information, but it works as a Github app that is installed on each repo on which it operates. As a result, it has the permissions required to get these priviledged fields.

So I suppose the main question to the registry maintainers is as follows: are we happy to have a security stance where all "contributors" in a repo can make a registry PR.

Given that most registry PRs are merged automatically, and with the focus on "supply chain attacks", this is an important question that should be carefully considered.

@fredrikekre
Copy link
Member

web registrator hosted on juliahub.com asks for write permissions.

This is nothing new though, right?

are we happy to have a security stance where all "contributors" in a repo can make a registry PR

I don't think so. Wouldn't that essentially mean "everyone" (i.e. you just have to fix a type in the docs or soemthing) can make releases?

@DilumAluthge
Copy link
Member

I don't think so. Wouldn't that essentially mean "everyone" (i.e. you just have to fix a type in the docs or soemthing) can make releases?

Agree. This seems like a big security hole.

@aviks
Copy link
Author

aviks commented Apr 14, 2021

This is nothing new though, right?

Yes, this is not new. But we've had a few users who've expressed strong opinions on the matter. So I wanted to have a public issue with a summary of the situation, and ensure we are not missing anything.

Wouldn't that essentially mean "everyone" (i.e. you just have to fix a type in the docs or soemthing)

Yes. That would indeed be the case.

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

If people want to register a package, and they don't want to give JuliaHub write access to their repo, they should install the Registrator GitHub App (the "comment bot").

If people want to register a package, and they don't want to give JuliaHub write access to their repo, and also for some reason they don't want to install a GitHub App, we can flesh out RegistratorServerless and make it an official third option. RegistratorServerless.jl:

  1. Does not require any write permissions on your repository
  2. Does not require that you install a GitHub App on your repository

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

Wouldn't that essentially mean "everyone" (i.e. you just have to fix a type in the docs or soemthing)

Yes. That would indeed be the case.

Yeah, I am strongly opposed to this.

@nkottary
Copy link
Member

Note that anyone can open a PR in the General Registry to register any package regardless of whether they are a contributor or not. I don't think the contributor check therefore can be called a security hole.

@aviks
Copy link
Author

aviks commented Apr 14, 2021

Yes, but registrator PRs are automerged, so that's different, and much more risky.

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

Note that anyone can open a PR in the General Registry to register any package regardless of whether they are a contributor or not. I don't think the contributor check therefore can be called a security hole.

  1. PRs opened by users that are not JuliaRegistrator or jlbuild are not auto-merged. In contrast, PRs opened by JuliaRegistrator are quickly auto-merged.
  2. If someone opens a manual PR to register a package, the registry maintainers are very careful to check that they should actually have permissions to do so.

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

One of the core premises of AutoMerge is that we can trust PRs coming from Registrator and jlbuild. The AutoMerge guidelines were based on this premise, and the AutoMerge code base was written with this assumption.

If PRs coming from Registrator can no longer be guaranteed to be coming from someone with the appropriate level of permissions, we should disable AutoMerge.

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

I'm have temporarily disabled AutoMerge: JuliaRegistries/General#34270 until we resolve this issue.

@nkottary
Copy link
Member

I think we should have a file in the repository that lists the users who can register that package.

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

I think we should have a file in the repository that lists the users who can register that package.

It's not the worse idea in the world. I had that thought in the past. Someone pointed out to me on Slack that technically there is a security issue if someone changes their GitHub username. E.g. if the file looks like this:

aviks
DilumAluthge

Suppose I change my GitHub username to DilumIsCool. Once I do so, GitHub will make the DilumAluthge username available for anyone to take. So if someone else signs up for GitHub, and picks the username DilumAluthge, now they have the ability to register versions of this package.

@DilumAluthge
Copy link
Member

Out of curiosity, why can't people just install the Registrator GitHub App ("comment bot")? It's easy to use, and it doesn't need any write permissions. (And we don't need a file in the repo with a list of authorized users.) Here are the required permissions:

D4171ACF-9F98-4F86-B59B-ACDD8E66CCD2

Are there people for whom this list of permissions is not acceptable?

@nkottary
Copy link
Member

Suppose I change my GitHub username to DilumIsCool. Once I do so, GitHub will make the DilumAluthge username available for anyone to take. So if someone else signs up for GitHub, and picks the username DilumAluthge, now they have the ability to register versions of this package.

We could use the registered email ID instead.

@DilumAluthge
Copy link
Member

Suppose I change my GitHub username to DilumIsCool. Once I do so, GitHub will make the DilumAluthge username available for anyone to take. So if someone else signs up for GitHub, and picks the username DilumAluthge, now they have the ability to register versions of this package.

We could use the registered email ID instead.

Sure. But this all feels like a solution looking for the problem. I still don't understand why users can't just use the GitHub App and call it a day.

@giordano
Copy link
Member

we can flesh out RegistratorServerless and make it an official third option.

I've been thinking about it lately, but I fear it'd open some backdoors to let anyone register a revision. I need to find some time to think about the workflow more carefully.

@nkottary
Copy link
Member

Sure. But this all feels like a solution looking for the problem. I still don't understand why users can't just use the GitHub App and call it a day.

Does not work for GitLab. And some people like using the WebUI.

@DilumAluthge
Copy link
Member

Sure. But this all feels like a solution looking for the problem. I still don't understand why users can't just use the GitHub App and call it a day.

Does not work for GitLab. And some people like using the WebUI.

Fair enough. Okay, we could look for a file on the repository called registrator_authorized_users.txt, and it would contain the email addresses of users authorized to register packages.

I suppose that this is opt-in only, right? If that file does not exist in the repository, then we don't use this feature.

@DilumAluthge
Copy link
Member

we can flesh out RegistratorServerless and make it an official third option.

I've been thinking about it lately, but I fear it'd open some backdoors to let anyone register a revision. I need to find some time to think about the workflow more carefully.

Definitely we need to do a lot more thinking (and careful review of the code) before we think about deploying RegistratorServerless.

@DilumAluthge
Copy link
Member

Sure. But this all feels like a solution looking for the problem. I still don't understand why users can't just use the GitHub App and call it a day.

Does not work for GitLab. And some people like using the WebUI.

Fair enough. Okay, we could look for a file on the repository called registrator_authorized_users.txt, and it would contain the email addresses of users authorized to register packages.

I suppose that this is opt-in only, right? If that file does not exist in the repository, then we don't use this feature.

This is probably the best way forward for people that absolutely need to use the web UI.

We should think about what identifier we use. Options include:

  1. Email address
  2. GitHub username

I think email address is better than GitHub username.

Is there a better identifier that we could use?

@DilumAluthge
Copy link
Member

@aviks @nkottary I would say that these are the steps forward:

  1. Disable the "allow any user to trigger Registrator" feature. Once this is disabled, we can re-enable AutoMerge.
  2. Add the "text file in the repository with a list of authorized email addresses" feature.

@nkottary
Copy link
Member

Sounds good. I can start working on this. I'm still open to hear from others what they think.

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

I posted this link in the #pkg-registration Slack channel, so hopefully more registry maintainers will see it. Also, we can directly ping more people if need be.

@GunnarFarneback
Copy link

GunnarFarneback commented Apr 14, 2021

A few observations:

  1. A registration request without bumped version number will fail in Registrator.
  2. If a random contributor can only request registration of merged code they can't get arbitrary code in and depending on how you manage your version bumps the impact can range from a risk of badly timed versions with code in a bad state to none at all (if you only bump version number just before you're going to register it anyway).
  3. If a random contributor can register versions using code from unmerged PRs it's a MAJOR security problem.
  4. Only accepting registrations from the owner of the repository if they don't want to give out write permissions seems likely to go a long way. Or is this also undecidable from the GitHub API?
  5. Anyone can make arbitrary PRs to General. The critical question is what gets merged, manually or automatically.

@nkottary
Copy link
Member

Only accepting registrations from the owner of the repository if they don't want to give out write permissions seems likely to go a long way. Or is this also undecidable from the GitHub API?

We already let the owner register if the package is under the owners account in GitHub. But packages are also in orgs and checking whether you belong to the org requires write access.

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

But packages are also in orgs and checking whether you belong to the org requires write access.

You can check for public membership without write access. (That's how RegistratorServerless works.) But you can't check if someone is a non-member that has commit access.

@giordano
Copy link
Member

But you can't check if someone is a non-member that has commit access.

And probably if a member doesn't have commit access?

@DilumAluthge
Copy link
Member

DilumAluthge commented Apr 14, 2021

But you can't check if someone is a non-member that has commit access.

And probably if a member doesn't have commit access?

Sure.

But, IIRC, Registrator will allow a member to register even if they don't have commit access, with the logic that if you added someone to your organization, presumably you are okay with them doing certain things (like registering packages) on behalf of the organization.

@nkottary
Copy link
Member

@DilumAluthge We've switched back to checking for collaborator. Can you enable auto-merge on General now?

@DilumAluthge
Copy link
Member

@StefanKarpinski
Copy link
Contributor

Some thoughts:

  • It seems annoying to ask people to create a file with their email ID in it just to register their own package. It would be nice if, in the absence of such a file, the owner of the repo can at least register versions.

  • Presumably we can get a list of each user's email IDs from the GitHub API with only read access? Can we distinguish verified email addresses from non-verified? What happens if one person has claimed an email address and someone else claims the same email address? What if they can both verify the address? If this is going to be the basis for a security mechanism, we should make sure we understand these things.

  • It might be good to include both a user name and an email address and require the email address to be verified (if we can tell that). That way, even if someone else claims an email address, unless they manage to also snag the matching user name, we won't allow them to register versions.

  • Maintaining a consistent list of who can tag versions of a bunch of packages seems like it could get annoying. If we're going to refer to a list, maybe a better way to do it would be to have an optional maintainers = "https://some/url" field in the Project.toml file that we download and use for this. That way a bunch of packages can refer to the same file that is centrally maintained.

  • We need to be careful about which version of the package is consulted for the maintainers list. Obviously you don't want a situation where someone can make PR that isn't merged, which changes the maintainers list and that allows them to tag new versions. The natural version of the project to consult seems like the latest commit of the default branch (main/master, whatever it is).

@nkottary
Copy link
Member

It seems annoying to ask people to create a file with their email ID in it just to register their own package. It would be nice if, in the absence of such a file, the owner of the repo can at least register versions.

This is already the case in JuliaHub. If the user owns the package then we authorize the user right away.

Presumably we can get a list of each user's email IDs from the GitHub API with only read access? Can we distinguish verified email addresses from non-verified? What happens if one person has claimed an email address and someone else claims the same email address? What if they can both verify the address? If this is going to be the basis for a security mechanism, we should make sure we understand these things.

GitHub only lets you make an email public if it is verified. GitHub only sends that one public email in the Oauth response. If a user has no public email IDs then we receive nothing.

It might be good to include both a user name and an email address and require the email address to be verified (if we can tell that). That way, even if someone else claims an email address, unless they manage to also snag the matching user name, we won't allow them to register versions.
Maintaining a consistent list of who can tag versions of a bunch of packages seems like it could get annoying. If we're going to refer to a list, maybe a better way to do it would be to have an optional maintainers = "https://some/url" field in the Project.toml file that we download and use for this. That way a bunch of packages can refer to the same file that is centrally maintained.

Yes, we should do these things. An org can have a public repo for keeping the maintainers file(s) that they want to link to in their package repos.

@nkottary
Copy link
Member

I did some work on this yesterday so I decided to put it in a PR #330. This does not have to be merged right away and we can continue to solidify the ideas here. I think the PR does much of the ground work needed such that we only need to figure out the little important bits like what needs to go in the auth file besides email and how the linking should work etc. Let me know your thoughts.

@StefanKarpinski
Copy link
Contributor

StefanKarpinski commented Apr 22, 2021

I would propose the following sequence of checks to decide who the maintainers are:

  1. If there is a maintainers = entry in the project file where the RHS is not a string, error.
  2. If there is a maintainers = "https://..." entry then the maintainers are downloaded from that URL.
  3. If there is a maintainers = entry whose value is a path to a file in the repo, use that as the maintainers.
  4. If there is a maintainers = entry that is not an https:// URL or a path to a file in the repo, error.
  5. If there is no maintainers entry in the project file but the file MAINTAINERS exists, use that as maintainers.
  6. Otherwise, allow only public org members / the user themselves as maintainers.

This feels like a lot of options but two of them are error cases (1 & 4) and the motivation for the other options is as follows. The simplest case is when someone maintains a package themselves and they don't have to do anything (6). If someone wants to allow some other person to maintain (i.e. tag releases) without messing with hosting a file somewhere, they can just add a file in the top of the project called MAINTAINERS and add that person to the file (5). If they want the file somewhere else in the repo, they can do that by adding a maintainers = "path/to/file" entry in the project file (3). If a group of packages should have the same set of maintainers, then the maintainers entry in the project file can be an https:// URL to a shared file of maintainers (2). This last option is a bit of a pain for standalone packages, so we don't want to force everyone to do this.

If people are ok with that, the next question is what the file format should be. It could just be a list of email addresses (with # treated as comments in the usual way). Are email addresses sufficient for this? Or do we want to have user names + email addresses in case someone manages to claim someone else's email address?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants