Spark-5246 resolving hostname #91

Merged
32 changes: 32 additions & 0 deletions resolve-hostname/setup.sh
@@ -0,0 +1,32 @@
#!/bin/bash

Let's rename this to something like resolve-hostname.sh or setup-hostname.sh in the top-level directory.


# A new instance started in a VPC often has a `hostname` such as 'ip-10-1-1-24', which is
# not resolvable. This causes problems such as the Spark UI failing to bind to that hostname
# on startup, as described in https://issues.apache.org/jira/browse/SPARK-5246.
# This script maps the private IP to that hostname via '/etc/hosts'.
#

#Are we in VPC?

Minor nit: add a space after # (in all the comments).

Author

fixed and pushed

MAC=`wget -q -O - http://169.254.169.254/latest/meta-data/mac`

The MAC doesn't seem to be used?

Author

It is used on line 11
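
To make the dependency on MAC easier to see, here is a minimal sketch that separates building the metadata URL from the VPC check. The helper names `vpc_id_url` and `in_vpc` and the `METADATA_BASE` variable are hypothetical and not part of this PR; only the metadata paths are taken from the script.

```shell
#!/bin/bash
# Base address of the EC2 instance metadata service, as used in the script.
METADATA_BASE="http://169.254.169.254/latest/meta-data"

# Hypothetical helper: print the metadata URL that reports the VPC id
# for the network interface with the given MAC address.
vpc_id_url() {
  local mac="$1"
  printf '%s/network/interfaces/macs/%s/vpc-id\n' "${METADATA_BASE}" "${mac}"
}

# Hypothetical helper: succeed only when a non-empty VPC id was returned,
# i.e. when the instance is in a VPC.
in_vpc() {
  local vpc_id="$1"
  [ -n "${vpc_id}" ]
}

# Possible usage on an EC2 instance (needs access to the metadata service):
#   MAC=$(wget -q -O - "${METADATA_BASE}/mac")
#   VPC_ID=$(wget -q -O - "$(vpc_id_url "${MAC}")")
#   in_vpc "${VPC_ID}" || exit 0
```

Keeping the check in a named function also makes the direction of the test explicit: the script should stop early only when the VPC id is empty.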

VPC_ID=`wget -q -O - http://169.254.169.254/latest/meta-data/network/interfaces/macs/${MAC}/vpc-id`
# An empty VPC id means the instance is not in a VPC, so there is nothing to fix
if [ -z "${VPC_ID}" ]; then
# echo "nothing to do - instance is not in VPC"
exit 0
fi

SHORT_HOSTNAME=`hostname`

PRIVATE_IP=`wget -q -O - http://169.254.169.254/latest/meta-data/local-ipv4`

# do changes only if short hostname does not resolve
if ( ! ping -c 1 -q "${SHORT_HOSTNAME}" > /dev/null 2>&1 ); then
echo -e "\n ${PRIVATE_IP} ${SHORT_HOSTNAME}\n" >> /etc/hosts

#let's make sure that it got fixed
if ( ! ping -c 1 -q "${SHORT_HOSTNAME}" > /dev/null 2>&1 ); then
#return some non-zero code to indicate problem
echo "Possible bug: unable to fix resolution of local hostname"
exit 62
fi

fi
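
The script above decides whether to append to /etc/hosts by pinging the short hostname. As an alternative guard, the following sketch checks the hosts file itself, so repeated runs never add a duplicate line. This is a different technique from the ping check in the PR. The function name `add_host_entry` and its parameters are hypothetical, and the file path is an argument so the function can be tried on a scratch copy instead of the real /etc/hosts.

```shell
#!/bin/bash
# Hypothetical helper: append "ip name" to a hosts file unless some
# non-comment line already lists the name as a hostname or alias.
add_host_entry() {
  local hosts_file="$1" ip="$2" name="$3"

  # Scan fields 2..NF of each non-comment line, stopping at a trailing comment.
  # awk exits 0 when the name is found and 1 otherwise.
  if awk -v n="${name}" '
    /^[ \t]*#/ { next }
    { for (i = 2; i <= NF; i++) { if ($i ~ /^#/) break; if ($i == n) found = 1 } }
    END { exit !found }
  ' "${hosts_file}"; then
    return 0
  fi

  printf '%s %s\n' "${ip}" "${name}" >> "${hosts_file}"
}

# Possible usage (assumes PRIVATE_IP and SHORT_HOSTNAME are set as in the script):
#   add_host_entry /etc/hosts "${PRIVATE_IP}" "${SHORT_HOSTNAME}"
```

Unlike the ping check, this does not confirm that the name resolves afterwards, so the follow-up verification in the script is still useful.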
2 changes: 1 addition & 1 deletion setup-slave.sh
@@ -14,7 +14,7 @@ source ec2-variables.sh

# Set hostname based on EC2 private DNS name, so that it is set correctly
# even if the instance is restarted with a different private DNS name
-PRIVATE_DNS=`wget -q -O - http://instance-data.ec2.internal/latest/meta-data/local-hostname`
+PRIVATE_DNS=`wget -q -O - http://169.254.169.254/latest/meta-data/local-hostname`

We need to invoke the new setup-hostname script here. You can assume that the spark-ec2 directory exists, so adding a line like `bash /root/spark-ec2/setup-hostname.sh` should be sufficient.

Author

I thought about creating a "module" and putting it in the list of modules in spark-ec2.py, before the "spark" module.

It might be better to do it your way.

Yeah - modules are more appropriate for new packages or something like that. For things like fixing hostnames we can just put it in the top level directory

Author

I added it to setup-slave.sh as you suggested. Should it also be invoked from setup.sh?

No - setup-slave.sh is invoked on all machines (including the master node). So this should be enough
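
For illustration, one way the invocation from setup-slave.sh could look is sketched below. The wrapper name `run_hostname_fix` is hypothetical. The default path comes from the suggestion earlier in this thread, and the non-zero status it reports would be whatever the hostname script exits with, for example 62 when the fix did not take effect.

```shell
#!/bin/bash
# Hypothetical wrapper: run the hostname fix in a child bash process and
# print a warning when it reports a failure.
run_hostname_fix() {
  local script="${1:-/root/spark-ec2/setup-hostname.sh}"
  local status=0

  # Run as a separate process so an `exit` inside the script cannot
  # terminate the calling setup script.
  bash "${script}" || status=$?

  if [ "${status}" -ne 0 ]; then
    echo "warning: ${script} exited with status ${status}" >&2
  fi
  return "${status}"
}

# Possible usage inside setup-slave.sh, continuing even if the fix fails:
#   run_hostname_fix || true
```

Whether setup should continue or stop after a failed fix is a design choice this sketch leaves to the caller.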

hostname $PRIVATE_DNS
echo $PRIVATE_DNS > /etc/hostname
HOSTNAME=$PRIVATE_DNS # Fix the bash built-in hostname variable too
4 changes: 2 additions & 2 deletions setup.sh
@@ -21,8 +21,8 @@ source ec2-variables.sh

# Set hostname based on EC2 private DNS name, so that it is set correctly
# even if the instance is restarted with a different private DNS name
-PRIVATE_DNS=`wget -q -O - http://instance-data.ec2.internal/latest/meta-data/local-hostname`
-PUBLIC_DNS=`wget -q -O - http://instance-data.ec2.internal/latest/meta-data/hostname`
+PRIVATE_DNS=`wget -q -O - http://169.254.169.254/latest/meta-data/local-hostname`
+PUBLIC_DNS=`wget -q -O - http://169.254.169.254/latest/meta-data/hostname`
hostname $PRIVATE_DNS
echo $PRIVATE_DNS > /etc/hostname
export HOSTNAME=$PRIVATE_DNS # Fix the bash built-in hostname variable too