Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFD: Automatic AWS Server Discovery #12410

Merged
merged 6 commits into from
Sep 1, 2022
Merged

Conversation

lxea
Copy link
Contributor

@lxea lxea commented May 4, 2022

Related: #11628

@github-actions github-actions bot added the rfd Request for Discussion label May 4, 2022
@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch 2 times, most recently from f326312 to 3a00070 Compare May 4, 2022 14:40
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch from 3a00070 to 3797009 Compare May 5, 2022 11:43
@lxea lxea requested a review from r0mant May 5, 2022 11:47
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
Copy link
Contributor

@nklaassen nklaassen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, realized a couple more things after clicking approve

rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch 2 times, most recently from f6cd790 to 2ac517d Compare May 6, 2022 12:21
Copy link
Collaborator

@r0mant r0mant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense in general, left a couple questions.

@klizhentas @xinding33 Could you folks take a look too from the product/UX perspective?

rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
Copy link
Contributor

@xinding33 xinding33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lxea This is a really cool feature. I find that for features related to auto discovery and enrollment, drafting user documentation helps a lot in figuring out the right UX. Can you draft some user documentation once most of the existing comments have been addressed?

rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
Copy link
Contributor

@klizhentas klizhentas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a solid start. We should change the way we set this up though to support cloud use cases from the start, what rules out changing auth service configs.

Another thing to think about is that admins most likely will want to change the scripts. Having defaults is helpful, but giving them opportunity to load scripts from file and/or config section of the service going to help going forward.

rfd/0057-automatic-aws-server-discovery.md Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
}
```

EC2 join tokens will have a `tags` field added so so only EC2 instances with matching
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that AWS feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean here, EC2 instances can be tagged and we should be able to match on them

}
```

The installation will run the host systems package manager to install Teleport
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would we store this script? What if users would like to tweak it? Is that going to be a part of the SSH service config?

Copy link
Contributor Author

@lxea lxea May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd intended for this to not be configurable by users, I've updated the document so the shell scripts that get run are specified in a predefined SSM document which gets executed instead of having arbitrary scripts.

This also helps to remove the restriction on the linux distros supported by this feature, which is nice

@klizhentas
Copy link
Contributor

@lxea @r0mant let's also add a security section and UX section to this RFD. Let's explore attack vectors and mitigations and UX examples of how users will perceive this new functionality

@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch 3 times, most recently from 587e9ab to 7eb2e77 Compare May 11, 2022 12:33
@lxea
Copy link
Contributor Author

lxea commented May 11, 2022

I've moved from EC2 join tokens to IAM tokens as they dont have an expiry which makes sense in the context of this feature

The shell command running is also changed, instead of being an unconfigurable set of commands, or a shell command stored in the teleport.yaml file, a user must create a ssm command document which specifies the set of commands to run rather then sending an arbitrary shell script for each install.

I also added a UX example, what other scenarios would be helpful?

Copy link
Contributor

@klizhentas klizhentas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there. Can we simplify the part with the instance document? Can we auto-create one if we have permissions? How can we simplify that part. Also a couple of nits on past time vs present

rfd/0057-automatic-aws-server-discovery.md Show resolved Hide resolved
rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch 3 times, most recently from dfc53d6 to bca8bf7 Compare May 19, 2022 12:46
Comment on lines 72 to 74
IAM join tokens will have a `tags` field added to the accounts so so only EC2
instances with matching `tags` can be added using the token. the tags matching will
default to `"*":"*"` to retain compatibility with previously generated IAM tokens
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious how the instance will prove which tags it has? The IAM join method currently only deals with an sts:GetCallerIdentity request/response, which wouldn't include this. This design involves ec2:DescribeInstances permissions on the SSH agent, but not necessary on Auth where the IAM joining verification will be run.

Which brings up a broader point: my original suggestion on this PR was to use the EC2 join method, not the IAM join method (I know they are similar and most people don't know we have both...), did you consider this? I think it would be better. Docs here in case you're unfamiliar https://goteleport.com/docs/setup/guides/joining-nodes-aws-ec2/

The EC2 join method currently requires ec2:DescribeInstances permissions on auth, but that's probably acceptable for this design. The benefits over the IAM method though are:

  1. The EC2 instance does not need to have an IAM role assigned. This is a requirement for the IAM method.
  2. Since auth is already calling ec2:DescribeInstances, you can probably actually enforce the tags requirement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... sorry, I just remembered the big downside of the EC2 join method is that it doesn't work on cloud specifically because it requires the ec2:DescribeInstances permissions on auth. Not great, sorry I didn't think this through more before commenting.

However, this might be a good opportunity to allow joining with the EC2 method through an SSH agent deployed on the customer's infra, I'm not sure how this would work yet but it could be doable. Would probably require an RFD of its own.

@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch from 0ca1b6d to 6b8fc15 Compare July 15, 2022 12:45
@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch from 6b8fc15 to 0465c93 Compare August 10, 2022 14:46
@r0mant
Copy link
Collaborator

r0mant commented Aug 15, 2022

@klizhentas You requested changes here before, and @lxea has since addressed the feedback, can you take another look? There were 2 outstanding comments from what I could see:

  • Teleport SSH agent attempting to automatically create SSM document. We have updated the RFD to mention this and will make appropriate update to the implementation.
  • Having optional "agentless" mode for auto-discovery. We have also included an optional discovery config setting that would control this in the RFD. It is out of scope initially but we can work on this after initial release if there's still interest.

rfd/0057-automatic-aws-server-discovery.md Outdated Show resolved Hide resolved
@github-actions github-actions bot removed the request for review from marcoandredinis August 15, 2022 18:47
Copy link
Contributor

@klizhentas klizhentas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still have a couple of questions after reading this


Example resource script:
```yaml
kind: installer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a singleton resource? Wouldn't it be better to have multiple script names available for different instance types and specify the script per label?

Copy link
Collaborator

@r0mant r0mant Aug 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it's a singleton, yes. Users can replace it with their own if they want to e.g. to detect instance type, etc. but if we wanted to have ability to have multiple scripts we could make this a non-singleton resource and for example add a field to auto-discovery configuration to optionally specify which install script to use.

ssh_service:
  enabled: "yes"
  aws:
   - types: ["ec2"]
     regions: ["eu-central-1"]
     tags:
       "arch": "ubuntu"
     install:
       script_name: "install-script-ubuntu"
   - types: ["ec2"]
     regions: ["eu-central-1"]
     tags:
       "arch": "centos"
     install:
       script_name: "install-script-centos"

This would require users to mark their AWS nodes with appropriate tags if they want to use different install scripts. Auto-discovery configuration will become dynamic as a part of Teleport Discover Day-2 work so they will be able to update this mapping dynamically as well when we implement it.

@klizhentas @lxea What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good to me, would the script_name option here just be used for parameterize the SSM command run?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lxea yes

- types: ["ec2"]
regions: ["us-west-1"]
tags:
"teleport": "yes" # aws tags to match
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I specify what SSM command and script to run? Would it just run one script for everything? Can we add a script doc name as an optional parameter here? We should create a default one, but it can't be singleton

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it works like you describe. We create a default SSM document but you can optionally specify a different one to use. There's a configuration example somewhere above:

  aws:
   - types: ["ec2"]
     regions: ["eu-central-1"]
     tags:
       "teleport": "yes"
     ssm:
       document: "TeleportDiscoveryInstaller" # default value

@lxea Please correct me if I'm wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is correct, responded to comments about different installer scripts above

@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch from 0465c93 to e72d090 Compare August 19, 2022 15:52
@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch from e72d090 to 379f298 Compare August 29, 2022 12:11
@r0mant
Copy link
Collaborator

r0mant commented Aug 31, 2022

@klizhentas We have converted the installer resource from singleton to multiple resources (#15682). Could you take another look and let us know if this lgty now?

@r0mant r0mant requested a review from klizhentas August 31, 2022 14:35
@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch from 379f298 to 4d97212 Compare August 31, 2022 15:19
@lxea lxea enabled auto-merge (squash) August 31, 2022 15:19
@lxea lxea force-pushed the lxea/aws-server-discovery-rfd branch from 4d97212 to dd86984 Compare August 31, 2022 15:54
@lxea lxea merged commit 6adb703 into master Sep 1, 2022
@zmb3 zmb3 deleted the lxea/aws-server-discovery-rfd branch September 9, 2022 18:49
capnspacehook pushed a commit that referenced this pull request Sep 12, 2022
* rfd: Automatic AWS Server Discovery

* Update for the changed config, also mention auto document provisions

* Update a config section

* Clarify that installer scripts are no longer singletons

* Introduce `discovery_service`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfd Request for Discussion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants