Add support for spot instances (#55)
* Add support for spot instances

* Add test for spot instances
akuzminsky authored Dec 1, 2024
1 parent eb59c40 commit 931e19f
Showing 13 changed files with 327 additions and 6 deletions.
12 changes: 10 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -34,6 +34,7 @@ module "website" {
userdata = module.webserver_userdata.userdata
stickiness_enabled = true
}
```
### Security groups

@@ -45,15 +46,20 @@ The module creates two security groups: one for the load balancer, another for the backend instances.
The load balancer security group allows traffic to TCP ports 443 and `var.alb_listener_port` (80 by default).

The backend security group allows user traffic and health checks coming from the load balancer.
Also, the security group allows SSH from the VPC where the backend instances reside and from `var.ssh_cidr_block`.
It is `0.0.0.0/0` by default, but the variable lets the user restrict SSH access, for example, to the management VPC only.

Both security groups allow incoming ICMP traffic.

The user can also specify extra security groups via `var.extra_security_groups_backend`.
They will be attached to the backend instances alongside the created backend security group.
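For illustration, a sketch of wiring an extra security group into the backend instances. The monitoring group, its port, and the module source path are assumptions, not taken from this repository:

```hcl
# Hypothetical extra security group, e.g. for a metrics scraper.
resource "aws_security_group" "monitoring" {
  name_prefix = "monitoring-"
  vpc_id      = var.vpc_id

  ingress {
    description = "Scrape metrics from the internal network"
    from_port   = 9100
    to_port     = 9100
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }
}

module "website" {
  source = "infrahouse/website-pod/aws" # assumed registry path
  # ... other required inputs elided ...

  ssh_cidr_block                = "10.1.0.0/16" # e.g. only the management VPC
  extra_security_groups_backend = [aws_security_group.monitoring.id]
}
```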

### Using spot instances

By default, the module launches on-demand instances only. However, if you set `var.on_demand_base_capacity`,
the ASG launches that many on-demand instances and fulfills the rest of its capacity with spot instances.
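As a minimal sketch (the capacity numbers are illustrative, the module source path is assumed, and the other required inputs are elided):

```hcl
module "website" {
  source = "infrahouse/website-pod/aws" # assumed registry path
  # ... other required inputs elided ...

  asg_min_size = 3
  # One instance stays on-demand; the remaining capacity is
  # requested as spot instances.
  on_demand_base_capacity = 1
}
```

Because the module hard-codes `on_demand_percentage_above_base_capacity = 0`, every instance beyond the base capacity is a spot request.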

## Requirements

| Name | Version |
@@ -166,6 +172,7 @@ They will be added to the backend instances alongside the created backend security group.
| <a name="input_key_pair_name"></a> [key\_pair\_name](#input\_key\_pair\_name) | SSH keypair name to be deployed in EC2 instances | `string` | n/a | yes |
| <a name="input_max_instance_lifetime_days"></a> [max\_instance\_lifetime\_days](#input\_max\_instance\_lifetime\_days) | The maximum amount of time, in \_days\_, that an instance can be in service, values must be either equal to 0 or between 7 and 365 days. | `number` | `30` | no |
| <a name="input_min_healthy_percentage"></a> [min\_healthy\_percentage](#input\_min\_healthy\_percentage) | Amount of capacity in the Auto Scaling group that must remain healthy during an instance refresh to allow the operation to continue, as a percentage of the desired capacity of the Auto Scaling group. | `number` | `100` | no |
| <a name="input_on_demand_base_capacity"></a> [on\_demand\_base\_capacity](#input\_on\_demand\_base\_capacity) | If specified, the ASG will request spot instances and this will be the minimal number of on-demand instances. | `number` | `null` | no |
| <a name="input_protect_from_scale_in"></a> [protect\_from\_scale\_in](#input\_protect\_from\_scale\_in) | Whether newly launched instances are automatically protected from termination by Amazon EC2 Auto Scaling when scaling in. | `bool` | `false` | no |
| <a name="input_root_volume_size"></a> [root\_volume\_size](#input\_root\_volume\_size) | Root volume size in EC2 instance in Gigabytes | `number` | `30` | no |
| <a name="input_service_name"></a> [service\_name](#input\_service\_name) | Descriptive name of a service that will use this VPC | `string` | `"website"` | no |
@@ -174,6 +181,7 @@ They will be added to the backend instances alongside the created backend security group.
| <a name="input_subnets"></a> [subnets](#input\_subnets) | Subnet ids where load balancer should be present | `list(string)` | n/a | yes |
| <a name="input_tags"></a> [tags](#input\_tags) | Tags to apply to instances in the autoscaling group. | `map(string)` | <pre>{<br/> "Name": "webserver"<br/>}</pre> | no |
| <a name="input_target_group_port"></a> [target\_group\_port](#input\_target\_group\_port) | TCP port that a target listens to to serve requests from the load balancer. | `number` | `80` | no |
| <a name="input_target_group_type"></a> [target\_group\_type](#input\_target\_group\_type) | Target group type: instance, ip, alb. Default is instance. | `string` | `"instance"` | no |
| <a name="input_userdata"></a> [userdata](#input\_userdata) | userdata for cloud-init to provision EC2 instances | `string` | n/a | yes |
| <a name="input_wait_for_capacity_timeout"></a> [wait\_for\_capacity\_timeout](#input\_wait\_for\_capacity\_timeout) | How much time to wait until all instances are healthy | `string` | `"20m"` | no |
| <a name="input_zone_id"></a> [zone\_id](#input\_zone\_id) | Domain name zone ID where the website will be available | `string` | n/a | yes |
24 changes: 21 additions & 3 deletions asg.tf
@@ -19,9 +19,27 @@ resource "aws_autoscaling_group" "website" {
}
triggers = ["tag"]
}
dynamic "launch_template" {
for_each = var.on_demand_base_capacity == null ? [1] : []
content {
id = aws_launch_template.website.id
version = aws_launch_template.website.latest_version
}
}
dynamic "mixed_instances_policy" {
for_each = var.on_demand_base_capacity == null ? [] : [1]
content {
instances_distribution {
on_demand_base_capacity = var.on_demand_base_capacity
on_demand_percentage_above_base_capacity = 0
}
launch_template {
launch_template_specification {
launch_template_id = aws_launch_template.website.id
version = aws_launch_template.website.latest_version
}
}
}
}
instance_maintenance_policy {
min_healthy_percentage = var.asg_min_healthy_percentage
2 changes: 1 addition & 1 deletion requirements.txt
@@ -5,4 +5,4 @@ myst-parser ~= 2.0
pytest ~= 7.3
pytest-timeout ~= 2.1
pytest-rerunfailures ~= 12.0
requests ~= 2.32
77 changes: 77 additions & 0 deletions test_data/test_spot/datasources.tf
@@ -0,0 +1,77 @@
data "cloudinit_config" "webserver_init" {
gzip = false
base64_encode = true

part {
content_type = "text/cloud-config"
content = join(
"\n",
[
"#cloud-config",
yamlencode(
{
"package_update" : true,
packages : [
"xinetd",
"net-tools"
]
write_files : [
{
path : "/etc/xinetd.d/http"
permissions : "0600"
content : file("${path.module}/xinetd.d.http")
},
{
path : "/usr/local/bin/httpd"
permissions : "0755"
content : file("${path.module}/httpd.sh")
}
]
runcmd : [
"systemctl start xinetd"
]
}
)
]
)
}
}

data "aws_ami" "ubuntu" {
most_recent = true

filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-${var.ubuntu_codename}-*"]
}

filter {
name = "architecture"
values = ["x86_64"]
}

filter {
name = "virtualization-type"
values = ["hvm"]
}

filter {
name = "state"
values = [
"available"
]
}

owners = ["099720109477"] # Canonical
}

data "aws_iam_policy_document" "webserver_permissions" {
statement {
actions = ["ec2:Describe*"]
resources = ["*"]
}
}

data "aws_route53_zone" "website" {
name = var.dns_zone
}
24 changes: 24 additions & 0 deletions test_data/test_spot/httpd.sh
@@ -0,0 +1,24 @@
#!/usr/bin/env bash

http_response () {
HTTP_CODE=$1
MESSAGE=${2:-Message Undefined}
length=$((${#MESSAGE} + 2))
if [[ "$HTTP_CODE" -eq 503 ]]; then
echo -en "HTTP/1.1 503 Service Unavailable\r\n"
elif [[ "$HTTP_CODE" -eq 200 ]]; then
echo -en "HTTP/1.1 200 OK\r\n"
else
echo -en "HTTP/1.1 ${HTTP_CODE} UNKNOWN\r\n"
fi
echo -en "Content-Type: text/plain\r\n"
echo -en "Connection: close\r\n"
echo -en "Content-Length: ${length}\r\n"
echo -en "\r\n"
echo -en "$MESSAGE"
echo -en "\r\n"
sleep 0.1
exit 0
}

http_response 200 "Success Message"
26 changes: 26 additions & 0 deletions test_data/test_spot/main.tf
@@ -0,0 +1,26 @@
resource "aws_key_pair" "test" {
public_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDpgAP1z1Lxg9Uv4tam6WdJBcAftZR4ik7RsSr6aNXqfnTj4civrhd/q8qMqF6wL//3OujVDZfhJcffTzPS2XYhUxh/rRVOB3xcqwETppdykD0XZpkHkc8XtmHpiqk6E9iBI4mDwYcDqEg3/vrDAGYYsnFwWmdDinxzMH1Gei+NPTmTqU+wJ1JZvkw3WBEMZKlUVJC/+nuv+jbMmCtm7sIM4rlp2wyzLWYoidRNMK97sG8+v+mDQol/qXK3Fuetj+1f+vSx2obSzpTxL4RYg1kS6W1fBlSvstDV5bQG4HvywzN5Y8eCpwzHLZ1tYtTycZEApFdy+MSfws5vPOpggQlWfZ4vA8ujfWAF75J+WABV4DlSJ3Ng6rLMW78hVatANUnb9s4clOS8H6yAjv+bU3OElKBkQ10wNneoFIMOA3grjPvPp5r8dI0WDXPIznJThDJO5yMCy3OfCXlu38VDQa1sjVj1zAPG+Vn2DsdVrl50hWSYSB17Zww0MYEr8N5rfFE= aleks@MediaPC"
}

module "lb" {
source = "../../"
providers = {
aws = aws
aws.dns = aws
}
service_name = "website"
subnets = var.lb_subnet_ids
ami = data.aws_ami.ubuntu.id
backend_subnets = var.backend_subnet_ids
asg_name = var.asg_name
asg_min_size = 2
on_demand_base_capacity = 1
internet_gateway_id = var.internet_gateway_id
zone_id = data.aws_route53_zone.website.zone_id
dns_a_records = var.dns_a_records
key_pair_name = aws_key_pair.test.key_name
userdata = data.cloudinit_config.webserver_init.rendered
health_check_type = "ELB"
instance_profile_permissions = data.aws_iam_policy_document.webserver_permissions.json
instance_role_name = var.instance_role_name
}
23 changes: 23 additions & 0 deletions test_data/test_spot/outputs.tf
@@ -0,0 +1,23 @@
output "network_subnet_public_ids" {
value = var.lb_subnet_ids
}

output "network_subnet_private_ids" {
value = var.backend_subnet_ids
}

output "network_subnet_all_ids" {
value = concat(var.backend_subnet_ids, var.lb_subnet_ids)
}

output "asg_name" {
value = module.lb.asg_name
}

output "instance_profile_name" {
value = module.lb.instance_profile_name
}

output "load_balancer_dns_name" {
value = module.lb.load_balancer_dns_name
}
12 changes: 12 additions & 0 deletions test_data/test_spot/providers.tf
@@ -0,0 +1,12 @@
provider "aws" {
assume_role {
role_arn = var.role_arn
}
region = var.region
default_tags {
tags = {
"created_by" : "infrahouse/terraform-aws-website-pod" # GitHub repository that created a resource
}

}
}
12 changes: 12 additions & 0 deletions test_data/test_spot/terraform.tf
@@ -0,0 +1,12 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.11"
}
cloudinit = {
source = "hashicorp/cloudinit"
version = "~> 2.3"
}
}
}
13 changes: 13 additions & 0 deletions test_data/test_spot/variables.tf
@@ -0,0 +1,13 @@
variable "region" {}
variable "role_arn" {}
variable "dns_a_records" {
default = ["", "www", "bogus-test-stuff"]
}
variable "dns_zone" {}
variable "ubuntu_codename" {}
variable "asg_name" { default = null }

variable "backend_subnet_ids" {}
variable "lb_subnet_ids" {}
variable "internet_gateway_id" {}
variable "instance_role_name" { default = null }
17 changes: 17 additions & 0 deletions test_data/test_spot/xinetd.d.http
@@ -0,0 +1,17 @@
# default: on
# description: xinetdhttpservice
service xinetdhttpservice
{
# Inspired by: https://github.com/rglaue/xinetd_bash_http_service/blob/master/xinetdhttpservice_config
disable = no
flags = REUSE
socket_type = stream
type = UNLISTED
port = 80
wait = no
user = nobody
server = /usr/local/bin/httpd
log_on_failure += USERID
only_from = 0.0.0.0/0
per_source = UNLIMITED
}
85 changes: 85 additions & 0 deletions tests/test_spot.py
@@ -0,0 +1,85 @@
import json
from os import path as osp
from pprint import pformat
from textwrap import dedent

import pytest
from infrahouse_toolkit.terraform import terraform_apply

from tests.conftest import (
TEST_ZONE,
REGION,
UBUNTU_CODENAME,
TRACE_TERRAFORM,
TEST_ROLE_ARN,
TEST_TIMEOUT,
wait_for_instance_refresh,
LOG,
)


@pytest.mark.timeout(TEST_TIMEOUT)
def test_lb(
service_network,
ec2_client,
route53_client,
elbv2_client,
autoscaling_client,
keep_after,
):
subnet_public_ids = service_network["subnet_public_ids"]["value"]
subnet_private_ids = service_network["subnet_private_ids"]["value"]
internet_gateway_id = service_network["internet_gateway_id"]["value"]

terraform_dir = "test_data/test_spot"

with open(osp.join(terraform_dir, "terraform.tfvars"), "w") as fp:
fp.write(
dedent(
f"""
region = "{REGION}"
role_arn = "{TEST_ROLE_ARN}"
dns_zone = "{TEST_ZONE}"
ubuntu_codename = "{UBUNTU_CODENAME}"
lb_subnet_ids = {json.dumps(subnet_public_ids)}
backend_subnet_ids = {json.dumps(subnet_private_ids)}
internet_gateway_id = "{internet_gateway_id}"
"""
)
)

with terraform_apply(
terraform_dir,
destroy_after=not keep_after,
json_output=True,
enable_trace=TRACE_TERRAFORM,
) as tf_output:
asg_name = tf_output["asg_name"]["value"]
wait_for_instance_refresh(asg_name, autoscaling_client)
response = autoscaling_client.describe_auto_scaling_groups(
AutoScalingGroupNames=[
asg_name,
],
)
LOG.debug(
"describe_auto_scaling_groups(%s): %s",
asg_name,
pformat(response, indent=4),
)

healthy_instance = None
for instance in response["AutoScalingGroups"][0]["Instances"]:
LOG.debug("Evaluating instance %s", pformat(instance, indent=4))
if instance["LifecycleState"] == "InService":
healthy_instance = instance
break
assert healthy_instance, f"Could not find a healthy instance in ASG {asg_name}"
healthy_instance_count = len(
[
i
for i in response["AutoScalingGroups"][0]["Instances"]
if i["LifecycleState"] == "InService"
]
)
assert healthy_instance_count == 2
6 changes: 6 additions & 0 deletions variables.tf
@@ -267,6 +267,12 @@ variable "service_name" {
default = "website"
}

variable "on_demand_base_capacity" {
description = "If specified, the ASG will request spot instances and this will be the minimal number of on-demand instances."
type = number
default = null
}

variable "ssh_cidr_block" {
description = "CIDR range that is allowed to SSH into the backend instances. Format is a.b.c.d/<prefix>."
type = string
