EMR AL2 refactor
skpathak2 committed Apr 2, 2021
1 parent f029abf commit ffd536d
Showing 2 changed files with 23 additions and 6 deletions.
18 changes: 17 additions & 1 deletion EMR/Assign_Private_IP/README.md
@@ -4,6 +4,22 @@ This is a Python script that can be used as a bootstrap action or as an EMR step

 What does this script do:

-- takes the private IP address as its argument
+- takes the private IP address and region as its argument
 - associate that IP to the eth0 interface of the master node
 - setup the necessary network configuration to ensure that all the traffic is redirected from the secondary to the primary IP address
+
+Steps to execute this script:
+
+- Please confirm that your AWS Identity and Access Management (IAM) policy allows permissions for EMR_EC2_DefaultRole and ec2:AssignPrivateIpAddresses.
+- Download the assign_private_ip_region.py script from awslabs github repo.
+- Save the script in an Amazon Simple Storage Service (Amazon S3) bucket.
+- Specify the script as a [custom bootstrap action](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-bootstrap.html#bootstrapCustom) while launching an Amazon EMR cluster. You can also run the script as an [Amazon EMR step](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-work-with-steps.html#emr-add-steps). The script requires an argument, which is a private IP address from the CIDR range of your subnet and the region. The script attaches that private IP address to the network interface (eth0) of the master node. The script also configures the network settings to redirect all traffic from the secondary IP address to the primary IP address.
+e.g.
+- From bash shell (on master node)
+s3://<bucket>/assign_private_ip.py 172.31.45.7 us-east-1
+
+- Using BA
+Script location:- s3://<s3 bucket>/assign_private_ip.py
+Optional arguments:- 172.31.45.13 us-east-1
+
+- To find the new IP address, open the Amazon Elastic Compute Cloud (Amazon EC2) console. Then, select the EC2 instance that's acting as the master node of the EMR cluster. The new IP address appears on the Description tab, in the Secondary private IPs field.
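
Reviewer note: the README says the script takes a private IP address from the subnet's CIDR range plus a region, passed as bootstrap-action arguments. The Python 3 sketch below is not part of this commit. It shows one way to check the IP against the CIDR and to assemble the two arguments. The helper names `check_ip_in_subnet` and `build_bootstrap_action`, the bucket name, and the subnet CIDR are all hypothetical, and the dictionary layout assumes the `BootstrapActions` shape that boto3's EMR `run_job_flow` call accepts.

```python
import ipaddress


def check_ip_in_subnet(private_ip, subnet_cidr):
    """Return True if private_ip lies inside subnet_cidr."""
    return ipaddress.ip_address(private_ip) in ipaddress.ip_network(subnet_cidr)


def build_bootstrap_action(script_path, private_ip, region):
    """Build one bootstrap-action entry (assumed boto3 run_job_flow shape).

    The argument order matches the script: sys.argv[1] is the private IP,
    sys.argv[2] is the region.
    """
    return {
        "Name": "Assign private IP",  # hypothetical display name
        "ScriptBootstrapAction": {
            "Path": script_path,
            "Args": [private_ip, region],
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    ip, cidr = "172.31.45.13", "172.31.32.0/20"
    if not check_ip_in_subnet(ip, cidr):
        raise SystemExit("%s is outside %s" % (ip, cidr))
    print(build_bootstrap_action(
        "s3://example-bucket/assign_private_ip.py", ip, "us-east-1"))
```

Checking the address before launch is a design choice of this sketch: a bad address is caught locally, before the `assign-private-ip-addresses` call runs during cluster bootstrap.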
11 changes: 6 additions & 5 deletions EMR/Assign_Private_IP/assign_private_ip.py
@@ -17,22 +17,23 @@

 if is_master == "true":
     private_ip = str(sys.argv[1])
+    region = str(sys.argv[2])
     instance_id = subprocess.check_output(['/usr/bin/curl -s http://169.254.169.254/latest/meta-data/instance-id'], shell=True)
-    interface_id = subprocess.check_output(['aws ec2 describe-instances --instance-ids %s | jq .Reservations[].Instances[].NetworkInterfaces[].NetworkInterfaceId' % instance_id], shell=True).strip().strip('"')
+    interface_id = subprocess.check_output(['aws ec2 describe-instances --region %s --instance-ids %s | jq .Reservations[].Instances[].NetworkInterfaces[].NetworkInterfaceId' % (region, instance_id)], shell=True).strip().strip('"')

     #Assign private IP to the master instance:
-    subprocess.check_call(['aws ec2 assign-private-ip-addresses --network-interface-id %s --private-ip-addresses %s' % (interface_id, private_ip)], shell=True)
+    subprocess.check_call(['aws ec2 assign-private-ip-addresses --region %s --network-interface-id %s --private-ip-addresses %s' % (region, interface_id, private_ip)], shell=True)

-    subnet_id = subprocess.check_output(['aws ec2 describe-instances --instance-ids %s | jq .Reservations[].Instances[].NetworkInterfaces[].SubnetId' % instance_id], shell=True).strip().strip('"').strip().strip('"')
+    subnet_id = subprocess.check_output(['aws ec2 describe-instances --region %s --instance-ids %s | jq .Reservations[].Instances[].NetworkInterfaces[].SubnetId' % (region, instance_id)], shell=True).strip().strip('"').strip().strip('"')

-    subnet_cidr = subprocess.check_output(['aws ec2 describe-subnets --subnet-ids %s | jq .Subnets[].CidrBlock' % subnet_id], shell=True).strip().strip('"')
+    subnet_cidr = subprocess.check_output(['aws ec2 describe-subnets --region %s --subnet-ids %s | jq .Subnets[].CidrBlock' % (region, subnet_id)], shell=True).strip().strip('"')
     cidr_prefix = subnet_cidr.split("/")[1]

     #Add the private IP address to the default network interface:
     subprocess.check_call(['sudo ip addr add dev eth0 %s/%s' % (private_ip, cidr_prefix)], shell=True)

     #Configure iptables rules such that traffic is redirected from the secondary to the primary IP address:
-    primary_ip = subprocess.check_output(['/sbin/ifconfig eth0 | grep \'inet addr:\' | cut -d: -f2 | awk \'{ print $1}\''], shell=True).strip()
+    primary_ip = subprocess.check_output(['/usr/bin/curl -s http://169.254.169.254/latest/meta-data/local-ipv4'], shell=True, universal_newlines=True).strip()
     subprocess.check_call(['sudo iptables -t nat -A PREROUTING -d %s -j DNAT --to-destination %s' % (private_ip, primary_ip)], shell=True)
 else:
     print "Not the master node"
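
Reviewer note: the hunk above chains `aws`, `jq`, `ip`, and `iptables` through `shell=True` strings. The Python 3 sketch below is not the committed code. It shows the same steps, with the `describe-instances` output parsed by the `json` module instead of `jq` and each command built as an argument list. The function names are hypothetical. Taking the first network interface is an assumption of this sketch (the script's `jq` filter lists every interface), and nothing here is executed against AWS.

```python
import json


def first_interface(describe_instances_json):
    """Return (NetworkInterfaceId, SubnetId) of the first interface found.

    Walks the same path as the jq filters in the script:
    Reservations[].Instances[].NetworkInterfaces[].
    """
    data = json.loads(describe_instances_json)
    for reservation in data["Reservations"]:
        for instance in reservation["Instances"]:
            for nic in instance["NetworkInterfaces"]:
                return nic["NetworkInterfaceId"], nic["SubnetId"]
    raise ValueError("no network interface found")


def cidr_prefix(subnet_cidr):
    """Return the prefix length part of a CIDR block, e.g. '20'."""
    return subnet_cidr.split("/")[1]


def build_commands(private_ip, primary_ip, subnet_cidr, region, interface_id):
    """Build the three commands the script runs, as argument lists."""
    prefix = cidr_prefix(subnet_cidr)
    return [
        # Attach the secondary private IP to the master's interface.
        ["aws", "ec2", "assign-private-ip-addresses", "--region", region,
         "--network-interface-id", interface_id,
         "--private-ip-addresses", private_ip],
        # Add the address to eth0 with the subnet's prefix length.
        ["sudo", "ip", "addr", "add", "dev", "eth0",
         "%s/%s" % (private_ip, prefix)],
        # Redirect traffic for the secondary IP to the primary IP.
        ["sudo", "iptables", "-t", "nat", "-A", "PREROUTING",
         "-d", private_ip, "-j", "DNAT", "--to-destination", primary_ip],
    ]
```

Each returned list could be passed to `subprocess.check_call` without `shell=True`, which avoids quoting problems if an argument ever contains shell metacharacters.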
