
aws_route53_resource with alias to ELB has changed after 0.7.5 upgrade #9289

Closed
mioi opened this issue Oct 7, 2016 · 44 comments

@mioi
Contributor

mioi commented Oct 7, 2016

Terraform Version

Terraform v0.7.5

Affected Resource(s)

  • aws_route53_record
  • aws_elb

Terraform Configuration Files

resource "aws_elb" "foobar-elb" {
  name            = "foobar-elb"
  security_groups = ["sg-foobaz"]

  subnets = ["subnet-foobaz"]

  listener {
    instance_port     = 443
    instance_protocol = "tcp"
    lb_port           = 443
    lb_protocol       = "tcp"
  }

  health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 5
    target              = "TCP:443"
    interval            = 30
  }

  instances = ["i-1234567"]

  cross_zone_load_balancing   = true
  idle_timeout                = 400
  connection_draining         = true
  connection_draining_timeout = 400
}

resource "aws_route53_record" "foo" {
  zone_id = "ZREDACTED0"
  name    = "foo.bar.baz"
  type    = "A"

  alias {
    name                   = "${aws_elb.foobar-elb.dns_name}"
    zone_id                = "${aws_elb.foobar-elb.zone_id}"
    evaluate_target_health = true
  }
}

Debug Output

(too much to redact)

Panic Output

n/a

Expected Behavior

It should return the "No changes. Infrastructure is up-to-date." message when running terraform plan.

Actual Behavior

It shows that it wants to do this:

~ aws_route53_record.foo
    alias.123456789.evaluate_target_health: "true" => "false"
    alias.123456789.name:                   "dualstack.foobar-elb-89012345.us-west-1.elb.amazonaws.com" => ""
    alias.123456789.zone_id:                "ZREDACTED1" => ""
    alias.567890123.evaluate_target_health:  "" => "true"
    alias.567890123.name:                    "" => "foobar-elb-89012345.us-west-1.elb.amazonaws.com"
    alias.567890123.zone_id:                 "" => "ZREDACTED2"

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. upgrade from 0.7.4 to 0.7.5 version of Terraform.
  2. run terraform plan

Important Factoids

It seems like the main difference (as shown in the terraform plan output) is that previously, with 0.7.4, terraform apply prepended dualstack. to the alias name of the Route53 record, but the .tfstate file did not reflect this. With 0.7.5, Terraform wants to remove the dualstack. prefix from the ELB name.

References

n/a

@loivis

loivis commented Oct 10, 2016

I hit the same issue after upgrading to 0.7.5 today and actually ran terraform apply. During the apply, Terraform reported that it had made the changes, but nothing actually happened. When running terraform plan again, it showed the same planned change. It seems the difference only appears in the plan phase and is never really applied.

aws_route53_record.dlclient: Still modifying... (10s elapsed)
module.varnish.aws_route53_record.api: Still modifying... (10s elapsed)
module.varnish.aws_route53_record.client: Still modifying... (10s elapsed)
aws_route53_record.api: Still modifying... (10s elapsed)
aws_route53_record.tlcbridge: Still modifying... (10s elapsed)
aws_route53_record.dlclient: Still modifying... (20s elapsed)
module.varnish.aws_route53_record.api: Still modifying... (20s elapsed)
module.varnish.aws_route53_record.client: Still modifying... (20s elapsed)
aws_route53_record.api: Still modifying... (20s elapsed)
aws_route53_record.tlcbridge: Still modifying... (20s elapsed)
aws_route53_record.dlclient: Still modifying... (30s elapsed)
module.varnish.aws_route53_record.api: Still modifying... (30s elapsed)
module.varnish.aws_route53_record.client: Still modifying... (30s elapsed)
aws_route53_record.api: Still modifying... (30s elapsed)
aws_route53_record.tlcbridge: Still modifying... (30s elapsed)
aws_route53_record.tlcbridge: Modifications complete
aws_route53_record.api: Modifications complete
aws_route53_record.dlclient: Modifications complete
module.varnish.aws_route53_record.api: Modifications complete
module.varnish.aws_route53_record.client: Modifications complete

Apply complete! Resources: 0 added, 5 changed, 0 destroyed.

@pporada-gl

I'm seeing the same thing on 0.7.5.

@vancluever
Contributor

Hi all!

Sorry this is an issue! Not too sure when this started happening, but it seems like the DNS name of the ELB is not being read using the correct process:

    //    Elastic Load Balancing API: Use  DescribeLoadBalancers (http://docs.aws.amazon.com/ElasticLoadBalancing/latest/APIReference/API_DescribeLoadBalancers.html)
    //  to get the value of CanonicalHostedZoneName. Use the same process to get
    // the CanonicalHostedZoneNameId. See HostedZone$Id.

resource_aws_elb.go is reading the dns_name field using DNSName.

Not too sure if this is 100% the issue, but I will try to set up a test when I have a chance and correct it if I can reproduce it, unless someone beats me to it. ;)
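
For reference, here's a minimal sketch (not the actual Terraform source, just an illustration against aws-sdk-go) of the two fields in play - DNSName, which the resource currently reads, versus CanonicalHostedZoneName, which the doc comment above says aliases should use. The load balancer name is the one from the config at the top of this issue.

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/elb"
)

func main() {
	svc := elb.New(session.New())

	out, err := svc.DescribeLoadBalancers(&elb.DescribeLoadBalancersInput{
		LoadBalancerNames: []*string{aws.String("foobar-elb")},
	})
	if err != nil {
		panic(err)
	}

	lb := out.LoadBalancerDescriptions[0]
	fmt.Println(aws.StringValue(lb.DNSName))                   // what dns_name is currently set from
	fmt.Println(aws.StringValue(lb.CanonicalHostedZoneName))   // what the SDK docs say to use for aliases
	fmt.Println(aws.StringValue(lb.CanonicalHostedZoneNameID)) // the zone_id attribute
}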

@vancluever
Contributor

Hey all,

So, I think I jumped the gun on this one. I can't reproduce this, either in an acceptance test or by manually running a config on 0.7.4, upgrading to v0.7.5, and then running terraform plan.

The only situation in which I'm observing the behavior is when the Route53 alias is added through the console. In that situation, the console prepends dualstack to the record (this implies that the ELB supports both IPv4 and IPv6, though funnily enough ELBs in VPCs have never supported IPv6, from what I've heard).

Can you give us some more details?

  • What region are your ELBs deployed in? I've tried us-west-2 and us-east-1.
  • Has anyone modified these resources in the AWS console since the resources were deployed with terraform apply?
  • Was this configuration or state originally added through the console and imported, either via a tool like terraforming or with Terraform's now-native import functionality?

Thanks!

Also, here is the config I used to test with - it's basically a slight mod of one of the ELB acceptance tests. Not a 1-to-1 but the important parts should be there.

provider "aws" {
  region = "us-west-2"
}

resource "aws_vpc" "azelb" {
  cidr_block           = "10.1.0.0/16"
  enable_dns_hostnames = true

  tags {
    Name = "subnet-vpc"
  }
}

resource "aws_subnet" "public_a_one" {
  vpc_id = "${aws_vpc.azelb.id}"

  cidr_block        = "10.1.1.0/24"
  availability_zone = "us-west-2a"
}

resource "aws_subnet" "public_b_one" {
  vpc_id = "${aws_vpc.azelb.id}"

  cidr_block        = "10.1.7.0/24"
  availability_zone = "us-west-2b"
}

resource "aws_elb" "ourapp" {
  name = "terraform-asg-deployment-example"

  subnets = [
    "${aws_subnet.public_a_one.id}",
    "${aws_subnet.public_b_one.id}",
  ]

  listener {
    instance_port     = 80
    instance_protocol = "http"
    lb_port           = 80
    lb_protocol       = "http"
  }

  cross_zone_load_balancing   = true
  idle_timeout                = 400
  connection_draining         = true
  connection_draining_timeout = 400

  depends_on = ["aws_internet_gateway.gw"]
}

resource "aws_internet_gateway" "gw" {
  vpc_id = "${aws_vpc.azelb.id}"

  tags {
    Name = "main"
  }
}

resource "aws_route53_zone" "elb_zone" {
  name   = "example.com"
  vpc_id = "${aws_vpc.azelb.id}"
}

resource "aws_route53_record" "elb_alias" {
  zone_id = "${aws_route53_zone.elb_zone.zone_id}"
  name    = "elb-alias"
  type    = "A"

  alias {
    name                   = "${aws_elb.ourapp.dns_name}"
    zone_id                = "${aws_elb.ourapp.zone_id}"
    evaluate_target_health = true
  }
}

@mioi
Contributor Author

mioi commented Oct 11, 2016

I appreciate your help, @vancluever!

  1. What region are your ELBs deployed in? I've tried us-west-2 and us-east-1.
    • us-east-1
  2. Has anyone modified these resources in the AWS console since the resources were deployed with terraform apply?
    • nope
  3. Was this configuration or state originally added through the console and imported, either via a tool like terraforming or with Terraform's now-native import functionality?
    • they were originally added via the terraform apply

I have also witnessed your observation of how the AWS console prepends dualstack. to the alias name. The behavior I witnessed with v0.7.4 was that a terraform apply would also cause dualstack. to be prepended to the alias name (automatically), but the dualstack. prefix would not appear in the alias name in the tfstate file.

When I use v0.7.5 of Terraform, a terraform plan rightly sees the discrepancy between what is actually in Route53 and what is in the tfstate file (i.e. the alias name is prepended with dualstack. in Route53, but has no dualstack. prefix in the tfstate).

I suspect that issue #9108 was hiding the underlying issue. Since #9108 was fixed via ae2b8d4, and rolled into v0.7.5, the actual issue (that I'm currently observing) has started happening.
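
If anyone wants to see the discrepancy directly, here's one way to compare the two (a sketch assuming default local state; the zone ID and record name are the redacted placeholders from above, and note that Route53 returns record names with a trailing dot):

# What Route53 actually holds for the record:
$ aws route53 list-resource-record-sets --hosted-zone-id ZREDACTED0 \
    --query "ResourceRecordSets[?Name == 'foo.bar.baz.']" --output json

# What the state file thinks it holds (0.7.x flattens the alias attributes):
$ grep 'alias\.' terraform.tfstate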

@jaygorrell

Just to add a little extra data here, my situation is nearly identical to @mioi's.

  1. us-west-2
  2. Never -- everything here is managed through TF, but there are some interesting notes (see below).
  3. Again, we're fully-managed through TF... no external changes.

It's also worth noting that the zone_id (redacted above, so you can't tell from the issue) is changing when this happens. These aren't personal, so it's fine to share them.

    alias.2598694550.zone_id:                "" => "Z33MTJ483KN6FU"
    alias.3359374463.zone_id:                "Z1H1FL5HABSF5" => ""

This is in addition to the "dualstack" removal piece.

A couple other notes that may match up with others reporting the problem:

  • I'm using a provider alias to manage DNS in a different account
  • My infrastructure is split across 3 aws accounts with one environment in each (dev, stage, prod)
  • The problem occurs on two different DNS entries (out of 12 or so) -- one of them in all 3 accounts and the other in only one of them. They all put DNS into the same central account though.

I'm wondering if the problem is more about the zone_id property of an existing ELB than about a new DNS entry, by chance?

@vancluever
Contributor

Thanks for the info @mioi and @jaygorrell!

It has definitely been in the back of my mind that this is more AWS-related than Terraform-related, with a dash of #9108 causing some issues here as well. I guess the best way to confirm this would be to roll back to v0.7.4, run terraform plan again, and see what happens.

Unfortunately this probably means that my chances of reproducing this from scratch are pretty much nil. :(

If the zone ID is changing, it could be that AWS has changed the ELB zone IDs on their side. Hopefully that's something AWS support could answer. At the very least they could confirm that the modification TF is trying to carry out would not be harmful to your service, and then you could just carry out the apply and have the new state written. Hopefully that's enough to address it; if there's a perpetual diff, then that's another issue.

PS: I did just check the test config again, and CanonicalHostedZoneName and DNSName are identical.

@jaygorrell

Unfortunately, applying doesn't do anything as @loivis mentioned:

I hit the same issue after upgrading to 0.7.5 today and actually ran terraform apply. During the apply, Terraform reported that it had made the changes, but nothing actually happened.

The change comes up in the plan continuously.

I'll try to do more testing later today.

@apparentlymart
Contributor

I've been seeing a lot of this in our stuff... we noticed it over the last few days, but possibly it has been happening since we upgraded to 0.7.5.

Two separate examples, showing it moving from dualstack to non-dualstack in different parts of our stack:

aws_route53_record.primary: Modifying...
  alias.2114506294.evaluate_target_health: "false" => "false"
  alias.2114506294.name:                   "dualstack.blahblah.us-west-1.elb.amazonaws.com" => ""
  alias.2114506294.zone_id:                "xxxxxxxxKJ0" => ""
  alias.346237404.evaluate_target_health:  "" => "false"
  alias.346237404.name:                    "" => "blahblah.us-west-1.elb.amazonaws.com"
  alias.346237404.zone_id:                 "" => "xxxxxxxxQJA"
~ aws_route53_record.primary
    alias.2037080052.evaluate_target_health: "" => "false"
    alias.2037080052.name:                   "" => "differentblahblah.us-west-2.elb.amazonaws.com"
    alias.2037080052.zone_id:                "" => "xxxxxxxx6FU"
    alias.510692058.evaluate_target_health:  "false" => "false"
    alias.510692058.name:                    "dualstack.differentblahblah.us-west-2.elb.amazonaws.com" => ""
    alias.510692058.zone_id:                 "xxxxxxxxSF5" => ""

These were grabbed by my co-workers and sent over to me, so I don't have much more to report, but it is clear from these that it's happening to us in both us-west-1 and us-west-2.

My (unsubstantiated) theory was that something in AWS is changing over to a new hostname scheme and zone, and that during the transition:

  • The Route53 API is intermittently returning the dualstack version of the hostname and a different zone id when we refresh, perhaps in an attempt to "normalize" what's being stored internally for these ELB hostnames.
  • The ELB API is continuing to return the old hostname.
  • Terraform refreshes both, and sees the Route53 API return dualstack and ELB return regular, and thus treats it as a diff to update the alias.

I found some docs that talk about the dualstack prefix and seem to assert that it means both IPv4 and IPv6 support, but that IPv6 is not supported for VPC load balancers. Could it be that Amazon is currently rolling out IPv6 ELB support for VPC load balancers, but has accidentally shipped the Route53 part of the change before the corresponding ELB part? (Both the ELBs in my examples above are VPC ones.)

Perhaps someone with access to non-useless AWS support could get some thoughts from the support team on this? 😀

@mrwacky42
Contributor

I, too, am having this problem.

I just imported a previously unmanaged ELB into Terraform today. The old Route53 record was errantly a CNAME record instead of an alias, so Terraform had to delete the record and create a new one.
Now TF is flip-flopping between the two records.
(Note I didn't redact the zone IDs, since those are AWS generated and owned zone IDs. Not secret!)

~ aws_route53_record.foo-elb
    alias.1753199741.evaluate_target_health: "" => "true"
    alias.1753199741.name:                   "" => "foo-1556342713.us-west-2.elb.amazonaws.com"
    alias.1753199741.zone_id:                "" => "Z33MTJ483KN6FU"
    alias.3555116034.evaluate_target_health: "true" => "false"
    alias.3555116034.name:                   "dualstack.foo-1556342713.us-west-2.elb.amazonaws.com" => ""
    alias.3555116034.zone_id:                "Z1H1FL5HABSF5" => ""

If I apply this and run terraform plan again, Terraform shows that it wants to swap the zone IDs again.

I have another ELB with a Route53 alias that is configured very much the same; the only difference is that I created it with 0.7.4. I just ran terraform plan on that one and, as expected, nothing is reported as needing an update.

Strange.

@jaygorrell

jaygorrell commented Oct 15, 2016

This SO post touches on a related issue, suggesting it's likely nothing new on the AWS side.
http://stackoverflow.com/questions/35480030/why-are-the-value-for-hosted-zone-id-different-for-elb-and-route-53-alias-target

It seems the issue here is that Terraform is trying to change the hosted zone on these alias records to the hosted zone for ELBs; in my case, that's Z1H1FL5HABSF5 for us-west-2. I think that may be more of a symptom than the problem, though, if the actual name is being set wrong (i.e. the dualstack prefix).

I'm not sure if it's connected, but I did see a somewhat related change in 0.7.5:
#9125

Pinging @stack72 for thoughts on whether this could be related.

@jangrewe

Still happens with 0.7.6 =(

@pmoust
Contributor

pmoust commented Oct 19, 2016

Could it be that Amazon is currently rolling out IPv6 ELB support for VPC load balancers, but have accidentally shipped a Route53 part of the change before the corresponding ELB part? (both the ELBs in my examples above are VPC ones.)

That's what I believe as well; I've pinged AWS support about it.

@johnrengelman
Contributor

This is hitting us as well, in us-east-1. We can't roll back to a previous version because we are also using the new us-east-2 region :(

@johnrengelman
Contributor

Some more info I found: aws/aws-cli#1761 (comment)

@johnrengelman
Contributor

Ok, digging through my LBs, I found that one created on 8/4/2016 returns Z3DZXE0Q79N41H for us-east-1, while one created on 8/31/2016 returns Z35SXDOTRQ7X7K.

Z35SXDOTRQ7X7K is the value that Route 53 returns for the ELB alias zone regardless of when the ELB was created.
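
In case it helps anyone audit their own fleet, this lists the name, zone ID, and creation time for every classic ELB in a region (just the standard describe call, no special flags):

$ aws elb describe-load-balancers --region us-east-1 \
    --query 'LoadBalancerDescriptions[].[LoadBalancerName,CanonicalHostedZoneNameID,CreatedTime]' \
    --output table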

@pickgr
Contributor

pickgr commented Oct 26, 2016

We are also seeing this issue on the latest Terraform (0.7.7), but only with some of our aliases/ELBs. It seems like it might be related to when they were originally created. I haven't found a workaround yet, and I even tried deleting the DNS record and having Terraform re-create it; the drift still exists afterwards. I'm wondering if destroying and re-creating the ELB itself would work? Unfortunately, that's not really a viable solution for anything in production.

@johnrengelman
Contributor

@pickgr yes, deleting the ELB and re-creating it will correct the issue. I have verified this.

@johnrengelman
Contributor

I think this change occurred with the introduction of ALB on August 11.

@jangrewe

I tainted all our ELBs and had TF recreate them, but the DNS records still get modified on every run.

@mioi
Contributor Author

mioi commented Nov 1, 2016

It seems like PR #9704 should have fixed this in v0.7.8, but the problem persists.

@stack72
Contributor

stack72 commented Nov 2, 2016

Closed via #9704 - please let us know if there is still an issue!

@stack72 stack72 closed this as completed Nov 2, 2016
@johnrengelman
Contributor

@stack72 I don't see how this will fix the issue where the ELB hosted zone differs between what the ELB reports and what Route53 says, as I noted above.

@jaygorrell

@stack72 Why was this closed when @mioi said the fix didn't work?

@stack72 stack72 reopened this Nov 2, 2016
@stack72
Contributor

stack72 commented Nov 2, 2016

OK, so it seems we have 2 different issues - one is the prepending of dualstack and the other is the change of the hosted zone ID.

Back to the drawing board on this one

@mioi
Contributor Author

mioi commented Nov 5, 2016

FTR, the consistent diff of the hosted zone ID is somewhat explained in aws/aws-cli#1761 (comment). It seems like AWS is mid-transition to public hosted zones. The unexpected hosted zone IDs are even listed in their documentation here:
http://docs.aws.amazon.com/general/latest/gr/rande.html#elb_region

Also, FTR, I created my ELB some time around July or early August 2016. For an ELB I created in mid-September, I do not see this issue.

@elblivion
Contributor

@mioi good find.

I have a weird twist on this. We have 3 envs where I create two A aliases to the same ELB, and we are getting this error in all 3.

2 are in us-east-1 and 1 is in eu-west-1. Of the 2 in us-east-1, the one with the newer ELB (June 2016 vs May 2016) gets this consistent diff on only one alias record - the other envs get diffs for both.

@shorn

shorn commented Nov 17, 2016

Possibly relevant post on Google Groups: https://groups.google.com/forum/#!topic/terraform-tool/WJevMu-vNso

@mioi - you might want to have a look at it and talk to Mitchell about it (assuming this is the same thing)

@jaygorrell - my situation seems to be quite similar to yours - can you have a look at your state file and console for your route53 record and see if they're inconsistent?

@jangrewe

Still happening in 0.7.11

aws_route53_record.alias-varnish: Modifying...
  alias.2097882636.evaluate_target_health: "" => "true"
  alias.2097882636.name:                   "" => "varnish-1007567XXX.eu-west-1.elb.amazonaws.com"
  alias.2097882636.zone_id:                "" => "Z3NF1Z3NOM5OY2"
  alias.2603514164.evaluate_target_health: "true" => "false"
  alias.2603514164.name:                   "varnish-1007567XXX.eu-west-1.elb.amazonaws.com." => ""
  alias.2603514164.zone_id:                "Z32O12XQLNTSW2" => ""

@andrewnoruk

andrewnoruk commented Nov 30, 2016

This is still happening for me as well.

Terraform v0.7.9

alias.1236393478.evaluate_target_health: "false" => "false"
alias.1236393478.name:                   "--------------.us-east-1.elb.amazonaws.com." => ""
alias.1236393478.zone_id:                "ZXXXXXXXXXXX7K" => ""
alias.3894576146.evaluate_target_health: "" => "false"
alias.3894576146.name:                   "" => "--------------.us-east-1.elb.amazonaws.com"
alias.3894576146.zone_id:                "" => "ZXXXXXXXXXXX7K"

@ryansch
Contributor

ryansch commented Nov 30, 2016

You can work around this issue by setting the zone_id to the value from https://docs.aws.amazon.com/general/latest/gr/rande.html#elb_region. During this transition phase, the ELB resource might lie about the zone_id instead of using the new value from that table.
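
Applied to the config at the top of this issue, that workaround would look roughly like this (Z35SXDOTRQ7X7K is the published us-east-1 value; substitute your region's ID from the table):

resource "aws_route53_record" "foo" {
  zone_id = "ZREDACTED0"
  name    = "foo.bar.baz"
  type    = "A"

  alias {
    name = "${aws_elb.foobar-elb.dns_name}"

    # Hard-coded from the ELB region table instead of the ELB's own
    # (possibly stale) zone_id attribute.
    zone_id = "Z35SXDOTRQ7X7K"

    evaluate_target_health = true
  }
}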

@jangrewe

jangrewe commented Dec 1, 2016

Thanks @ryansch, that workaround does the job. Not nice, but working ;-)

@mioi
Contributor Author

mioi commented Dec 1, 2016

I also found a "workaround": I essentially re-created the affected ELBs. After that, I was able to update our version of Terraform to 0.7.13 and I no longer see the diffs in terraform plan.
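
(For anyone trying the same thing, one way to force that re-creation - sketched here with the resource name from the original config - is to taint the ELB, though note @jangrewe reported above that tainting alone didn't fix it for him:)

$ terraform taint aws_elb.foobar-elb
$ terraform plan   # should now show the ELB being destroyed and re-created
$ terraform apply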

@nckslvrmn

nckslvrmn commented Dec 1, 2016

+1 I have been experiencing the same problem in every Terraform version from 0.7.5 up to 0.7.13. The temporary workaround of wrapping your ${aws_elb.elb-name.dns_name} in lower() stops the name from wanting to change case again; however, there is also a trailing period that AWS appends to the alias name in Route53. Additionally, the hosted zone ID still wants to be changed between the correct R53 zone (according to the ELB region doc posted in previous comments) and the current zone ID of the ELB endpoint DNS. Here's an example:

alias.3001947546.name:                   "xxxxxx.us-east-1.elb.amazonaws.com." => ""
alias.3001947546.zone_id:                "Z35SXDOTRQ7X7K" => ""
alias.326979470.evaluate_target_health:  "" => "false"
alias.326979470.name:                    "" => "xxxxxx.us-east-1.elb.amazonaws.com"
alias.326979470.zone_id:                 "" => "Z3DZXE0Q79N41H"

Due to the number of ELBs I have, re-creating them isn't a feasible workaround.

The zone ID changing problem lies mainly with AWS. When running the awscli to describe the above ELB, the output contains the "CanonicalHostedZoneNameID" value, which should be Z35SXDOTRQ7X7K but is actually Z3DZXE0Q79N41H. I have confirmed that a newly created ELB gets the correct CanonicalHostedZoneNameID ending in 7X7K, but I really hope we won't have to re-create the old ones to make that change retroactively. I believe the old zone ID applies to any ELB built before the rollout of ALBs; any new ELB is placed in the new hosted zone.

As for the lowercasing and the period appended to the record, I'm sure that can be handled in the ELB resource code.

@jangrewe

jangrewe commented Dec 2, 2016

@mioi what do you mean by "essentially re-created"? I tried tainting the ELB, which destroys and re-creates it, but that didn't fix it for me...

@nckslvrmn

nckslvrmn commented Dec 2, 2016

That's interesting that it didn't fix it for you, @jangrewe. When I create a new ELB (regardless of configuration), it is placed in the correct hosted zone ending in 7X7K and Terraform doesn't try to modify the DNS entry name. Additionally, I tainted one of our non-essential ELBs, and after it was re-created, it too had the correct hosted zone ID and DNS name, and its associated R53 record no longer wants to change on every plan and apply.

My issue with this solution is that it involves destroying and re-creating every ELB that predates the rollout of ALBs. I have a ton of them, and for many I would have to spin up a new one alongside and switch DNS before killing the old one off, to avoid losing traffic during the short downtime and DNS propagation window. For anyone who doesn't mind tainting each one and re-creating it, this seems like a proper solution. I just wonder if AWS has any plans to migrate existing ELB DNS endpoints to the new hosted zone ID.

EDITED: to clarify, by the way, tainting and re-applying the resource only fixed the hosted zone ID problem. Combining tainting with the lower() around the ELB DNS name fixed both of my issues for the ELB I tainted.

@nckslvrmn

nckslvrmn commented Dec 2, 2016

An additional solution that I'm probably going to implement is to make a map of the current hosted zone IDs and, instead of passing the ELB's hosted zone ID to the record, pass the value from the map.

So instead of doing this:

resource "aws_route53_record" "test-elb" {
  zone_id = "${aws_route53_zone.test-zone.zone_id}"
  name = "test-elb.${var.domain}"
  type = "A"
  alias {
    name = "${aws_elb.vpc_test-elb.dns_name}"
    zone_id = "${aws_elb.vpc_test-elb.zone_id}"
    evaluate_target_health = false
  }
}

where test-elb is the problematic ELB with the wrong-case name and hosted zone ID, do this instead:

variable "route53_hosted_zone_ids" {
  default = {
    us-east-1 = "Z35SXDOTRQ7X7K"
    us-east-2 = "Z3AADJGX6KTTL2"
    us-west-1 = "Z368ELLRRE2KJ0"
    us-west-2 = "Z1H1FL5HABSF5"
    ap-south-1 = "ZP97RAFLXTNZK"
    ap-northeast-2 = "ZWKZPGTI48KDX"
    ap-southeast-1 = "Z1LMS91P8CMLE5"
    ap-southeast-2 = "Z1GM3OXH4ZPM65"
    ap-northeast-1 = "Z14GRHDCWA56QT"
    eu-central-1 = "Z215JYRZR1TBD5"
    eu-west-1 = "Z32O12XQLNTSW2"
    sa-east-1 = "Z2P70J7HTTTPLU"
  }
}

resource "aws_route53_record" "test-elb" {
  zone_id = "${aws_route53_zone.test-zone.zone_id}"
  name = "test-elb.${var.domain}"
  type = "A"
  alias {
    name = "${lower(aws_elb.vpc_test-elb.dns_name)}."
    zone_id = "${lookup(var.route53_hosted_zone_ids, "us-east-1")}"
    evaluate_target_health = false
  }
}

Combined with the lower() call and the appended period, this is an effective workaround for me that didn't involve tainting or otherwise modifying the existing ELB. It is of course a code change for EVERY ELB-aliased DNS record I have, but I'm more okay with that considering it doesn't touch the existing ELB. I gathered this map of zone IDs from the AWS doc in previous comments.

@stack72
Contributor

stack72 commented Jan 4, 2017

Hi all

This is correct - in order to normalize the alias records, we were dropping the dualstack prefix from the start of the name. A dualstack record has a different hosted zone ID. I am on the path to rectifying this right now.

Thanks for the understanding, and apologies for the time taken here - I just opened a PR for a data source that will do what @nckslvrmn is doing above.

Almost ready here

@nckslvrmn

nckslvrmn commented Jan 26, 2017

Thanks @stack72. I also noticed that the casing change is still happening, so my records are all still changing on every plan and apply, as shown below:

~ aws_route53_record.test
    alias.number.evaluate_target_health: "false" => "false"
    alias.number.name:                   "vpc-test-blah.us-east-1.elb.amazonaws.com" => ""
    alias.number.zone_id:                "Z35SXDOTRQ7X7K" => ""
    alias.number.evaluate_target_health: "" => "false"
    alias.number.name:                   "" => "VPC-Test-Blah.us-east-1.elb.amazonaws.com"
    alias.number.zone_id:                "" => "Z35SXDOTRQ7X7K"

@CpuID
Contributor

CpuID commented Feb 14, 2017

Looking forward to this being resolved... :) Has been there a little while now.

@mattclegg
Contributor

It looks like you can now use the aws_elb_hosted_zone_id data source.

See: #11027
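
Usage would look roughly like this (a sketch against the original config from this issue; the data source defaults to the provider's region):

data "aws_elb_hosted_zone_id" "main" {}

resource "aws_route53_record" "foo" {
  zone_id = "ZREDACTED0"
  name    = "foo.bar.baz"
  type    = "A"

  alias {
    name = "${aws_elb.foobar-elb.dns_name}"

    # The data source returns the documented per-region ELB hosted zone ID,
    # sidestepping the stale CanonicalHostedZoneNameID on older ELBs.
    zone_id = "${data.aws_elb_hosted_zone_id.main.id}"

    evaluate_target_health = true
  }
}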

@stefansundin
Contributor

stefansundin commented May 3, 2017

I have the same problem for a couple of ELBs in us-west-1. These ELBs were all created in April and May of 2016, and they were given a CanonicalHostedZoneNameID of Z1M58G0W56PQJA.

Example:

$ aws elb describe-load-balancers --region us-west-1 --query 'LoadBalancerDescriptions[0].[CanonicalHostedZoneNameID,CreatedTime]' --output text --load-balancer-names my-elb
Z1M58G0W56PQJA  2016-04-25T22:23:12.710Z

However, ELBs that were created in June 2016 have a different CanonicalHostedZoneNameID:

$ aws elb describe-load-balancers --region us-west-1 --query 'LoadBalancerDescriptions[0].[CanonicalHostedZoneNameID,CreatedTime]' --output text --load-balancer-names another-elb
Z368ELLRRE2KJ0  2016-06-14T03:17:22.060Z

This ID corresponds to the Route53 zone ID AWS publishes here: https://docs.aws.amazon.com/general/latest/gr/rande.html#elb_region

(Because the zone IDs for ELBs are publicly available there, there is no need to censor them; they only reveal which region your ELB is in.)

My guess is that AWS for some reason switched from Z1M58G0W56PQJA to Z368ELLRRE2KJ0 starting around June 2016. For what reason, I have no clue. I tried using archive.org to get the old zone ID from that page, but at that time they didn't publish that information: https://web-beta.archive.org/web/20160626023414/https://docs.aws.amazon.com/general/latest/gr/rande.html#elb_region

But if you google "Z1M58G0W56PQJA", you get a lot of results.

Anyway, to save customers from issues, AWS will transparently translate Z1M58G0W56PQJA to Z368ELLRRE2KJ0 behind the scenes when you create an alias record. I have found that they do this for other types of API calls too, so it seems to be a somewhat common (but undocumented) practice.

It would have been nice if AWS had updated the CanonicalHostedZoneNameID property on all old ELBs, but maybe that would have had a chance of disrupting things even more.

Has anyone tried opening a ticket with AWS to see if they can update this behind the scenes?

The best solution is probably to use the aws_elb_hosted_zone_id datasource that @stack72 created: #11027

Another fix might be to make Terraform ignore changes to the zone_id in certain cases. There would have to be a hard-coded list of old zone IDs. Not sure it's worth the effort though.

I tried putting:

  lifecycle {
    ignore_changes = ["alias"]
  }

in my aws_route53_record, and it worked, but it's a bad solution since updates to any of the other alias attributes, such as name, are ignored too.

The final option is to re-create your ELB.

Edit: I made a terrible proof of concept for the fix I suggested above: https://github.com/stefansundin/terraform/commit/549db9ff1a4a7f795d4eb25ee2a73bb23197d969, and it works :) But it's still not pretty.

@ctso

ctso commented Mar 14, 2018

Have anyone tried opening a ticket with AWS to see if they can update this behind the scenes?

I have, yes. The response I got was basically that "things changed": older ELBs will return an older CanonicalHostedZoneNameID, while newer ones will be correct. It seems they are rewriting the HostedZoneId when creating the ALIAS record.

Unfortunately this means we're also looking at using ignore_changes, which is far less than ideal, and recreating the ELBs is easier said than done in some cases.

@ghost

ghost commented Apr 1, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 1, 2020