Gangams/aad stage3 msi auth #585

Merged
merged 32 commits into from
Jul 19, 2021
32 commits
cacf15a
changes related to aad msi auth feature
ganga1980 Jun 14, 2021
679038b
use existing envvars
ganga1980 Jun 14, 2021
b0730e8
fix imds token expiry interval
ganga1980 Jun 14, 2021
e2ad0a3
refactor the windows agent ingestion token code
ganga1980 Jun 15, 2021
c206553
code cleanup
ganga1980 Jun 15, 2021
5d1ee06
fix build errors
ganga1980 Jun 15, 2021
3586e3d
code clean up
ganga1980 Jun 15, 2021
bdb7037
code clean up
ganga1980 Jun 15, 2021
ef73233
code clean up
ganga1980 Jun 15, 2021
bed0042
code clean up
ganga1980 Jun 15, 2021
6ba4320
Merge branch 'ci_dev' into gangams/aad-stag3-msi-auth
ganga1980 Jun 15, 2021
e44ea9b
more refactoring
ganga1980 Jun 15, 2021
69128b7
fix bug
ganga1980 Jun 15, 2021
b1ddedf
fix bug
ganga1980 Jun 15, 2021
d8df6b5
add debug logs
ganga1980 Jun 16, 2021
ffc5c20
add nil checks
ganga1980 Jun 16, 2021
6faa97f
revert changes
ganga1980 Jun 16, 2021
ce9c483
revert yaml change since this added in aks side
ganga1980 Jun 16, 2021
46d1973
fix pr feedback
ganga1980 Jun 18, 2021
8091e55
fix pr feedback
ganga1980 Jun 19, 2021
1eff4c5
refine retry code
ganga1980 Jun 19, 2021
a99597d
update mdsd env as per official build
ganga1980 Jun 23, 2021
72bcfff
cleanup
ganga1980 Jun 23, 2021
130d0ca
update env vars per mdsd
ganga1980 Jun 23, 2021
d4ab072
update with mdsd official build
ganga1980 Jun 24, 2021
38eb5ea
Merge branch 'ci_dev' into gangams/aad-stag3-msi-auth
ganga1980 Jun 24, 2021
9a2a94d
skip cert gen & renewal incase of aad msi auth
ganga1980 Jun 29, 2021
6562c3b
Merge branch 'ci_dev' into gangams/aad-stag3-msi-auth
ganga1980 Jul 1, 2021
02374ff
add nil check
ganga1980 Jul 8, 2021
09ba05b
cherry windows agent nodeip issue
rashmichandrashekar Jul 14, 2021
18a2ae4
merge latest ci_dev code
ganga1980 Jul 15, 2021
a26b841
fix merge issue
ganga1980 Jul 16, 2021
3 changes: 3 additions & 0 deletions build/linux/installer/datafiles/base_container.data
@@ -150,6 +150,9 @@ MAINTAINER: 'Microsoft Corporation'

/etc/fluent/plugin/omslog.rb; source/plugins/utils/omslog.rb; 644; root; root
/etc/fluent/plugin/oms_common.rb; source/plugins/utils/oms_common.rb; 644; root; root
/etc/fluent/plugin/extension.rb; source/plugins/utils/extension.rb; 644; root; root
/etc/fluent/plugin/extension_utils.rb; source/plugins/utils/extension_utils.rb; 644; root; root


/etc/fluent/kube.conf; build/linux/installer/conf/kube.conf; 644; root; root
/etc/fluent/container.conf; build/linux/installer/conf/container.conf; 644; root; root
@@ -36,14 +36,18 @@ def start

def enumerate
begin
puts "Calling certificate renewal code..."
maintenance = OMS::OnboardingHelper.new(
ENV["WSID"],
ENV["DOMAIN"],
ENV["CI_AGENT_GUID"]
)
ret_code = maintenance.register_certs()
puts "Return code from register certs : #{ret_code}"
if !ENV["AAD_MSI_AUTH_MODE"].nil? && !ENV["AAD_MSI_AUTH_MODE"].empty? && ENV["AAD_MSI_AUTH_MODE"].downcase == "true"
puts "skipping certificate renewal code since AAD MSI auth configured"
else
puts "Calling certificate renewal code..."
maintenance = OMS::OnboardingHelper.new(
ENV["WSID"],
ENV["DOMAIN"],
ENV["CI_AGENT_GUID"]
)
ret_code = maintenance.register_certs()
puts "Return code from register certs : #{ret_code}"
end
rescue => errorStr
puts "in_heartbeat_request::enumerate:Failed in enumerate: #{errorStr}"
# STDOUT telemetry should already be going to Traces in AI.
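
The Ruby check above treats AAD_MSI_AUTH_MODE case-insensitively, while main.sh compares USING_AAD_MSI_AUTH against the literal "true". A minimal shell sketch of the same case-insensitive test follows; `is_true` is a hypothetical helper name and is not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): succeeds when the argument equals
# "true" in any letter case, mirroring the Ruby `.downcase == "true"` check.
is_true() {
  [ "$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')" = "true" ]
}

if is_true "${AAD_MSI_AUTH_MODE:-}"; then
  echo "skipping certificate renewal code since AAD MSI auth configured"
else
  echo "Calling certificate renewal code..."
fi
```

An unset or empty variable falls through to the legacy branch, which matches the nil/empty guards in the Ruby code.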
112 changes: 72 additions & 40 deletions kubernetes/linux/main.sh
@@ -12,7 +12,7 @@ waitforlisteneronTCPport() {
echo "${FUNCNAME[0]} called with incorrect arguments<$1 , $2>. Required arguments <#port, #wait-time-in-seconds>"
return -1
else

if [[ $port =~ $numeric ]] && [[ $waittimesecs =~ $numeric ]]; then
#local varlistener=$(netstat -lnt | awk '$6 == "LISTEN" && $4 ~ ":25228$"')
while true
@@ -57,7 +57,11 @@ else
export customResourceId=$AKS_RESOURCE_ID
echo "export customResourceId=$AKS_RESOURCE_ID" >> ~/.bashrc
source ~/.bashrc
echo "customResourceId:$customResourceId"
echo "customResourceId:$customResourceId"
export customRegion=$AKS_REGION
echo "export customRegion=$AKS_REGION" >> ~/.bashrc
source ~/.bashrc
echo "customRegion:$customRegion"
fi

#set agent config schema version
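
main.sh repeats the same three steps for most variables: export the value, append an `export` line to ~/.bashrc, and echo it. As a sketch only, that pattern could be wrapped in one function. `persist_env` is a hypothetical name, the optional rc-file argument exists so the example can write to a scratch file, and `eastus` is a placeholder value rather than anything from the PR.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): export NAME=VALUE in the current shell
# and append the matching export line to an rc file (default: ~/.bashrc).
persist_env() {
  pe_name=$1
  pe_value=$2
  pe_rcfile=${3:-"$HOME/.bashrc"}
  export "$pe_name=$pe_value"
  # Single-quote the value so spaces survive re-sourcing; values that
  # contain a single quote are not handled in this sketch.
  printf "export %s='%s'\n" "$pe_name" "$pe_value" >> "$pe_rcfile"
}

# Usage mirroring the customRegion lines, written to a scratch file here.
rcfile=$(mktemp)
persist_env customRegion "${AKS_REGION:-eastus}" "$rcfile"
echo "customRegion:$customRegion"
cat "$rcfile"
rm -f "$rcfile"
```

Unlike the unquoted `echo "export X=$X" >> ~/.bashrc` lines in the script, this version quotes the persisted value, which is a design choice of the sketch and not a description of the PR's behavior.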
@@ -194,9 +198,15 @@ fi
if [ -z $domain ]; then
CLOUD_ENVIRONMENT="unknown"
elif [ $domain == "opinsights.azure.com" ]; then
CLOUD_ENVIRONMENT="public"
else
CLOUD_ENVIRONMENT="national"
CLOUD_ENVIRONMENT="azurepubliccloud"
elif [ $domain == "opinsights.azure.cn" ]; then
CLOUD_ENVIRONMENT="azurechinacloud"
elif [ $domain == "opinsights.azure.us" ]; then
CLOUD_ENVIRONMENT="azureusgovernmentcloud"
elif [ $domain == "opinsights.azure.eaglex.ic.gov" ]; then
CLOUD_ENVIRONMENT="usnat"
elif [ $domain == "opinsights.azure.microsoft.scloud" ]; then
CLOUD_ENVIRONMENT="ussec"
fi
export CLOUD_ENVIRONMENT=$CLOUD_ENVIRONMENT
echo "export CLOUD_ENVIRONMENT=$CLOUD_ENVIRONMENT" >> ~/.bashrc
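
The domain-to-cloud mapping in this hunk could also be written as a `case` statement, which keeps each domain and its cloud name on one line. This is a sketch under assumptions: `cloud_env_for_domain` is a hypothetical function name, and sending unmapped domains to "unknown" is an assumption, since in the if/elif chain above a non-empty domain that matches no branch sets nothing.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): print the cloud environment name for a
# Log Analytics domain. Empty or unmapped domains print "unknown" (assumption).
cloud_env_for_domain() {
  case "$1" in
    opinsights.azure.com)              echo "azurepubliccloud" ;;
    opinsights.azure.cn)               echo "azurechinacloud" ;;
    opinsights.azure.us)               echo "azureusgovernmentcloud" ;;
    opinsights.azure.eaglex.ic.gov)    echo "usnat" ;;
    opinsights.azure.microsoft.scloud) echo "ussec" ;;
    *)                                 echo "unknown" ;;
  esac
}

CLOUD_ENVIRONMENT=$(cloud_env_for_domain "${domain:-}")
export CLOUD_ENVIRONMENT
echo "CLOUD_ENVIRONMENT=$CLOUD_ENVIRONMENT"
```

Quoting `"$1"` also avoids the word-splitting errors that the unquoted `[ -z $domain ]` and `[ $domain == ... ]` tests can hit when the value is empty or contains spaces.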
@@ -233,9 +243,9 @@ if [ ${#APPLICATIONINSIGHTS_AUTH_URL} -ge 1 ]; then # (check if APPLICATIONINSI
fi


aikey=$(echo $APPLICATIONINSIGHTS_AUTH | base64 --decode)
export TELEMETRY_APPLICATIONINSIGHTS_KEY=$aikey
echo "export TELEMETRY_APPLICATIONINSIGHTS_KEY=$aikey" >> ~/.bashrc
aikey=$(echo $APPLICATIONINSIGHTS_AUTH | base64 --decode)
export TELEMETRY_APPLICATIONINSIGHTS_KEY=$aikey
echo "export TELEMETRY_APPLICATIONINSIGHTS_KEY=$aikey" >> ~/.bashrc

source ~/.bashrc

@@ -421,7 +431,7 @@ export KUBELET_RUNTIME_OPERATIONS_ERRORS_METRIC="kubelet_docker_operations_error
if [ "$CONTAINER_RUNTIME" != "docker" ]; then
# these metrics are available only on k8s versions <1.18 and will get deprecated from 1.18
export KUBELET_RUNTIME_OPERATIONS_METRIC="kubelet_runtime_operations"
export KUBELET_RUNTIME_OPERATIONS_ERRORS_METRIC="kubelet_runtime_operations_errors"
export KUBELET_RUNTIME_OPERATIONS_ERRORS_METRIC="kubelet_runtime_operations_errors"
fi

echo "set caps for ruby process to read container env from proc"
@@ -445,34 +455,56 @@ DOCKER_CIMPROV_VERSION=$(dpkg -l | grep docker-cimprov | awk '{print $3}')
echo "DOCKER_CIMPROV_VERSION=$DOCKER_CIMPROV_VERSION"
export DOCKER_CIMPROV_VERSION=$DOCKER_CIMPROV_VERSION
echo "export DOCKER_CIMPROV_VERSION=$DOCKER_CIMPROV_VERSION" >> ~/.bashrc
echo "*** activating oneagent in legacy auth mode ***"
CIWORKSPACE_id="$(cat /etc/omsagent-secret/WSID)"
#use the file path as its secure than env
CIWORKSPACE_keyFile="/etc/omsagent-secret/KEY"
cat /etc/mdsd.d/envmdsd | while read line; do
echo $line >> ~/.bashrc
done
source /etc/mdsd.d/envmdsd
echo "setting mdsd workspaceid & key for workspace:$CIWORKSPACE_id"
export CIWORKSPACE_id=$CIWORKSPACE_id
echo "export CIWORKSPACE_id=$CIWORKSPACE_id" >> ~/.bashrc
export CIWORKSPACE_keyFile=$CIWORKSPACE_keyFile
echo "export CIWORKSPACE_keyFile=$CIWORKSPACE_keyFile" >> ~/.bashrc
export OMS_TLD=$domain
echo "export OMS_TLD=$OMS_TLD" >> ~/.bashrc
export MDSD_FLUENT_SOCKET_PORT="29230"
echo "export MDSD_FLUENT_SOCKET_PORT=$MDSD_FLUENT_SOCKET_PORT" >> ~/.bashrc

#skip imds lookup since not used in legacy auth path
#skip imds lookup since it is not used in either the legacy or the aad msi auth path
export SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH="true"
echo "export SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH=$SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH" >> ~/.bashrc

# this used by mdsd to determine cloud specific LA endpoints
export OMS_TLD=$domain
echo "export OMS_TLD=$OMS_TLD" >> ~/.bashrc
cat /etc/mdsd.d/envmdsd | while read line; do
echo $line >> ~/.bashrc
done
source /etc/mdsd.d/envmdsd
MDSD_AAD_MSI_AUTH_ARGS=""
# check if its AAD Auth MSI mode via USING_AAD_MSI_AUTH
export AAD_MSI_AUTH_MODE=false
if [ "${USING_AAD_MSI_AUTH}" == "true" ]; then
echo "*** activating oneagent in aad auth msi mode ***"
# msi auth specific args
MDSD_AAD_MSI_AUTH_ARGS="-a -A"
export AAD_MSI_AUTH_MODE=true
echo "export AAD_MSI_AUTH_MODE=true" >> ~/.bashrc
# this used by mdsd to determine the cloud specific AMCS endpoints
export customEnvironment=$CLOUD_ENVIRONMENT
echo "export customEnvironment=$customEnvironment" >> ~/.bashrc
export MDSD_FLUENT_SOCKET_PORT="28230"
echo "export MDSD_FLUENT_SOCKET_PORT=$MDSD_FLUENT_SOCKET_PORT" >> ~/.bashrc
export ENABLE_MCS="true"
echo "export ENABLE_MCS=$ENABLE_MCS" >> ~/.bashrc
export MONITORING_USE_GENEVA_CONFIG_SERVICE="false"
echo "export MONITORING_USE_GENEVA_CONFIG_SERVICE=$MONITORING_USE_GENEVA_CONFIG_SERVICE" >> ~/.bashrc
export MDSD_USE_LOCAL_PERSISTENCY="false"
echo "export MDSD_USE_LOCAL_PERSISTENCY=$MDSD_USE_LOCAL_PERSISTENCY" >> ~/.bashrc
else
echo "*** activating oneagent in legacy auth mode ***"
CIWORKSPACE_id="$(cat /etc/omsagent-secret/WSID)"
#use the file path as its secure than env
CIWORKSPACE_keyFile="/etc/omsagent-secret/KEY"
echo "setting mdsd workspaceid & key for workspace:$CIWORKSPACE_id"
export CIWORKSPACE_id=$CIWORKSPACE_id
echo "export CIWORKSPACE_id=$CIWORKSPACE_id" >> ~/.bashrc
export CIWORKSPACE_keyFile=$CIWORKSPACE_keyFile
echo "export CIWORKSPACE_keyFile=$CIWORKSPACE_keyFile" >> ~/.bashrc
export MDSD_FLUENT_SOCKET_PORT="29230"
echo "export MDSD_FLUENT_SOCKET_PORT=$MDSD_FLUENT_SOCKET_PORT" >> ~/.bashrc
fi
source ~/.bashrc

dpkg -l | grep mdsd | awk '{print $2 " " $3}'

if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then
echo "starting mdsd with mdsd-port=26130, fluentport=26230 and influxport=26330 in legacy auth mode in sidecar container..."
if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then
echo "starting mdsd with mdsd-port=26130, fluentport=26230 and influxport=26330 in sidecar container..."
#use tenant name to avoid unix socket conflict and different ports for port conflict
#roleprefix to use container specific mdsd socket
export TENANT_NAME="${CONTAINER_TYPE}"
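
The auth-mode branch in this hunk changes two things that are easy to check in isolation: the extra mdsd flags (`-a -A` in AAD MSI mode, none in legacy mode) and the fluent socket port (28230 versus 29230). A sketch of that selection follows. The function names `mdsd_auth_args` and `mdsd_fluent_port` are hypothetical; the flag and port values are taken from the diff above.

```shell
#!/bin/sh
# Hypothetical helpers (not in the PR). Both take the value of
# USING_AAD_MSI_AUTH and treat anything other than "true" as legacy auth.
mdsd_auth_args() {
  if [ "$1" = "true" ]; then
    echo "-a -A"
  fi
}

mdsd_fluent_port() {
  if [ "$1" = "true" ]; then
    echo "28230"
  else
    echo "29230"
  fi
}

MDSD_AAD_MSI_AUTH_ARGS=$(mdsd_auth_args "${USING_AAD_MSI_AUTH:-false}")
MDSD_FLUENT_SOCKET_PORT=$(mdsd_fluent_port "${USING_AAD_MSI_AUTH:-false}")
export MDSD_FLUENT_SOCKET_PORT
echo "mdsd args: '${MDSD_AAD_MSI_AUTH_ARGS}', fluent port: ${MDSD_FLUENT_SOCKET_PORT}"
```

As in main.sh, `MDSD_AAD_MSI_AUTH_ARGS` would be expanded unquoted on the mdsd command line so that the two flags split into separate arguments and an empty value adds nothing.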
@@ -482,23 +514,23 @@ if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then
source ~/.bashrc
mkdir /var/run/mdsd-${CONTAINER_TYPE}
# add -T 0xFFFF for full traces
mdsd -r ${MDSD_ROLE_PREFIX} -p 26130 -f 26230 -i 26330 -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos &
else
echo "starting mdsd in legacy auth mode in main container..."
# add -T 0xFFFF for full traces
mdsd -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos &
mdsd ${MDSD_AAD_MSI_AUTH_ARGS} -r ${MDSD_ROLE_PREFIX} -p 26130 -f 26230 -i 26330 -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos &
else
echo "starting mdsd mode in main container..."
# add -T 0xFFFF for full traces
mdsd ${MDSD_AAD_MSI_AUTH_ARGS} -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos &
fi

# no dependency on fluentd for prometheus side car container
if [ "${CONTAINER_TYPE}" != "PrometheusSidecar" ]; then
# no dependency on fluentd for prometheus side car container
if [ "${CONTAINER_TYPE}" != "PrometheusSidecar" ]; then
if [ ! -e "/etc/config/kube.conf" ]; then
echo "*** starting fluentd v1 in daemonset"
fluentd -c /etc/fluent/container.conf -o /var/opt/microsoft/docker-cimprov/log/fluentd.log --log-rotate-age 5 --log-rotate-size 20971520 &
else
echo "*** starting fluentd v1 in replicaset"
fluentd -c /etc/fluent/kube.conf -o /var/opt/microsoft/docker-cimprov/log/fluentd.log --log-rotate-age 5 --log-rotate-size 20971520 &
fi
fi
fi
fi

#If config parsing was successful, a copy of the conf file with replaced custom settings file is created
if [ ! -e "/etc/config/kube.conf" ]; then
@@ -635,7 +667,7 @@ echo "getting rsyslog status..."
service rsyslog status

shutdown() {
pkill -f mdsd
pkill -f mdsd
}

trap "shutdown" SIGTERM
4 changes: 2 additions & 2 deletions kubernetes/linux/setup.sh
@@ -9,8 +9,8 @@ sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen && \
dpkg-reconfigure --frontend=noninteractive locales && \
update-locale LANG=en_US.UTF-8

#install oneagent - Official bits (05/17/2021)
wget https://github.com/microsoft/Docker-Provider/releases/download/05172021-oneagent/azure-mdsd_1.10.1-build.master.213_x86_64.deb
#install oneagent - Official bits (06/24/2021)
wget https://github.com/microsoft/Docker-Provider/releases/download/06242021-oneagent/azure-mdsd_1.10.3-build.master.241_x86_64.deb

/usr/bin/dpkg -i $TMPDIR/azure-mdsd*.deb
cp -f $TMPDIR/mdsd.xml /etc/mdsd.d
72 changes: 66 additions & 6 deletions kubernetes/windows/main.ps1
@@ -43,17 +43,49 @@ function Start-FileSystemWatcher {

function Set-EnvironmentVariables {
$domain = "opinsights.azure.com"
$cloud_environment = "public"
$mcs_endpoint = "monitor.azure.com"
$cloud_environment = "azurepubliccloud"
if (Test-Path /etc/omsagent-secret/DOMAIN) {
# TODO: Change to omsagent-secret before merging
$domain = Get-Content /etc/omsagent-secret/DOMAIN
$cloud_environment = "national"
if (![string]::IsNullOrEmpty($domain)) {
if ($domain -eq "opinsights.azure.com") {
$cloud_environment = "azurepubliccloud"
$mcs_endpoint = "monitor.azure.com"
} elseif ($domain -eq "opinsights.azure.cn") {
$cloud_environment = "azurechinacloud"
$mcs_endpoint = "monitor.azure.cn"
} elseif ($domain -eq "opinsights.azure.us") {
$cloud_environment = "azureusgovernmentcloud"
$mcs_endpoint = "monitor.azure.us"
} elseif ($domain -eq "opinsights.azure.eaglex.ic.gov") {
$cloud_environment = "usnat"
$mcs_endpoint = "monitor.azure.eaglex.ic.gov"
} elseif ($domain -eq "opinsights.azure.microsoft.scloud") {
$cloud_environment = "ussec"
$mcs_endpoint = "monitor.azure.microsoft.scloud"
} else {
Write-Host "Invalid or Unsupported domain name $($domain). EXITING....."
exit 1
}
} else {
Write-Host "Domain name either null or empty. EXITING....."
exit 1
}
}

Write-Host "Log analytics domain: $($domain)"
Write-Host "MCS endpoint: $($mcs_endpoint)"
Write-Host "Cloud Environment: $($cloud_environment)"

# Set DOMAIN
[System.Environment]::SetEnvironmentVariable("DOMAIN", $domain, "Process")
[System.Environment]::SetEnvironmentVariable("DOMAIN", $domain, "Machine")

# Set MCS Endpoint
[System.Environment]::SetEnvironmentVariable("MCS_ENDPOINT", $mcs_endpoint, "Process")
[System.Environment]::SetEnvironmentVariable("MCS_ENDPOINT", $mcs_endpoint, "Machine")

# Set CLOUD_ENVIRONMENT
[System.Environment]::SetEnvironmentVariable("CLOUD_ENVIRONMENT", $cloud_environment, "Process")
[System.Environment]::SetEnvironmentVariable("CLOUD_ENVIRONMENT", $cloud_environment, "Machine")
@@ -158,7 +190,7 @@ function Set-EnvironmentVariables {
Write-Host $_.Exception
}
}

# Check if the fetched IKey was properly encoded. if not then turn off telemetry
if ($aiKeyFetched -match '^[A-Za-z0-9=]+$') {
Write-Host "Using cloud-specific instrumentation key"
@@ -229,6 +261,21 @@ function Set-EnvironmentVariables {
Write-Host "Failed to set environment variable HOSTNAME for target 'machine' since it is either null or empty"
}

# check if its AAD Auth MSI mode via USING_AAD_MSI_AUTH environment variable
$isAADMSIAuth = [System.Environment]::GetEnvironmentVariable("USING_AAD_MSI_AUTH", "process")
if (![string]::IsNullOrEmpty($isAADMSIAuth)) {
[System.Environment]::SetEnvironmentVariable("AAD_MSI_AUTH_MODE", $isAADMSIAuth, "Process")
[System.Environment]::SetEnvironmentVariable("AAD_MSI_AUTH_MODE", $isAADMSIAuth, "Machine")
Write-Host "Successfully set environment variable AAD_MSI_AUTH_MODE - $($isAADMSIAuth) for target 'machine'..."
}

# check if use token proxy endpoint set via USE_IMDS_TOKEN_PROXY_END_POINT environment variable
$useIMDSTokenProxyEndpoint = [System.Environment]::GetEnvironmentVariable("USE_IMDS_TOKEN_PROXY_END_POINT", "process")
if (![string]::IsNullOrEmpty($useIMDSTokenProxyEndpoint)) {
[System.Environment]::SetEnvironmentVariable("USE_IMDS_TOKEN_PROXY_END_POINT", $useIMDSTokenProxyEndpoint, "Process")
[System.Environment]::SetEnvironmentVariable("USE_IMDS_TOKEN_PROXY_END_POINT", $useIMDSTokenProxyEndpoint, "Machine")
Write-Host "Successfully set environment variable USE_IMDS_TOKEN_PROXY_END_POINT - $($useIMDSTokenProxyEndpoint) for target 'machine'..."
}
$nodeIp = [System.Environment]::GetEnvironmentVariable("NODE_IP", "process")
if (![string]::IsNullOrEmpty($nodeIp)) {
[System.Environment]::SetEnvironmentVariable("NODE_IP", $nodeIp, "machine")
@@ -427,7 +474,15 @@ function Start-Telegraf {
else {
Write-Host "Failed to set environment variable KUBERNETES_SERVICE_PORT for target 'machine' since it is either null or empty"
}

$nodeIp = [System.Environment]::GetEnvironmentVariable("NODE_IP", "process")
if (![string]::IsNullOrEmpty($nodeIp)) {
[System.Environment]::SetEnvironmentVariable("NODE_IP", $nodeIp, "machine")
Write-Host "Successfully set environment variable NODE_IP - $($nodeIp) for target 'machine'..."
}
else {
Write-Host "Failed to set environment variable NODE_IP for target 'machine' since it is either null or empty"
}

Write-Host "Installing telegraf service"
C:\opt\telegraf\telegraf.exe --service install --config "C:\etc\telegraf\telegraf.conf"

@@ -524,8 +579,13 @@ if (![string]::IsNullOrEmpty($requiresCertBootstrap) -and `
Bootstrap-CACertificates
}

Generate-Certificates
Test-CertificatePath
$isAADMSIAuth = [System.Environment]::GetEnvironmentVariable("USING_AAD_MSI_AUTH")
if (![string]::IsNullOrEmpty($isAADMSIAuth) -and $isAADMSIAuth.ToLower() -eq 'true') {
Write-Host "skipping agent onboarding via cert since AAD MSI Auth configured"
} else {
Generate-Certificates
Test-CertificatePath
}
Start-Fluent-Telegraf

# List all powershell processes running. This should have main.ps1 and filesystemwatcher.ps1