-
Notifications
You must be signed in to change notification settings - Fork 753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add introspection service and a script to collect support bundle #8
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to run gofmt on this.
ipamd/datastore/data_store.go
Outdated
@@ -90,7 +107,7 @@ func NewDataStore() *DataStore { | |||
} | |||
|
|||
// AddENI add eni to data store | |||
func (ds *DataStore) AddENI(eniID string, deviceNumber int, isPrimary bool) error { | |||
func (ds *DataStore) AddENI(eniID string, DeviceNumber int, IsPrimary bool) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we Capitalizing function args?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will change it back to lowercase
ipamd/datastore/data_store.go
Outdated
var eniInfos = ENIInfos{ | ||
TotalIPs: ds.total, | ||
AssignedIPs: ds.assigned, | ||
ENIIPPools: make(map[string]ENIIPPool), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make(map[string]ENIIPPool, len(ds.eniIPPools))
will give the right initial size and avoid reallocations as you copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. thanks
ipamd/datastore/data_store.go
Outdated
// AssignedIPv4Addresses is the number of ip addesses already been assigned | ||
AssignedIPv4Addresses int | ||
// IPv4Addresses shows whether each address is assigned | ||
IPv4Addresses map[string]*AddressInfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume these name changes are to export the struct members for visibility in the introspection package; why exclude the time fields then? Also the nice formatting alignment got messed up. Oh, and please consistently capitalize initialisms ENI,IP,etc in the comments.
ipamd/datastore/data_store.go
Outdated
DeviceNumber int | ||
// AssignedIPv4Addresses is the number of ip addesses already been assigned | ||
AssignedIPv4Addresses int | ||
// IPv4Addresses shows whether each address is assigned |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document the key format
ipamd/datastore/data_store.go
Outdated
// AssignedIPv4Addresses is the number of ip addesses already been assigned | ||
AssignedIPv4Addresses int | ||
// IPv4Addresses shows whether each address is assigned | ||
IPv4Addresses map[string]*AddressInfo | ||
} | ||
|
||
// AddressInfo contains inforation about an IP | ||
type AddressInfo struct { | ||
address string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document the form of this string. Is it redundant with the key of the map to these structs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The form of this string is the dot notation of IP address. Making address as part of AddressInfo struct can make AddressInfo re-used by other part of code,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The form should be documented in the code; suggest: "IP address strings must be in dot-decimal notation with no leading zeroes and no whitespace (eg: '10.1.0.253')"
Note: there was an attempt to standardize text representation via IETF RFC, though it was never formally accepted. As you can see from that doc, there is historical ambiguity in how to represent IP addresses as text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
ipamd/datastore/data_store.go
Outdated
ds.lock.Lock() | ||
defer ds.lock.Unlock() | ||
|
||
var podInfos = make(map[string]PodIPInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make(map[string]PodIPInfo, len(ds.podsIP))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ipamd/introspect.go
Outdated
return func(w http.ResponseWriter, r *http.Request) { | ||
responseJSON, err := json.Marshal(ipam.dataStore.GetENIInfos()) | ||
if err != nil { | ||
log.Error("Failed to introspect eni data: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Failed to marshal ENI data"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ipamd/introspect.go
Outdated
return func(w http.ResponseWriter, r *http.Request) { | ||
responseJSON, err := json.Marshal(ipam.dataStore.GetPodInfos()) | ||
if err != nil { | ||
log.Error("Failed to introspect pod data: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
marshal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// express or implied. See the License for the specific language governing | ||
// permissions and limitations under the License. | ||
|
||
package utils |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a TODO to extract this to library (it's copied from ecs agent, yes?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ipamd/introspect.go
Outdated
@@ -122,7 +122,7 @@ func podV1RequestHandler(ipam *IPAMContext) func(http.ResponseWriter, *http.Requ | |||
return func(w http.ResponseWriter, r *http.Request) { | |||
responseJSON, err := json.Marshal(ipam.dataStore.GetPodInfos()) | |||
if err != nil { | |||
log.Error("Failed to introspect pod data: %v", err) | |||
log.Error("Failed to marshal POD data: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"pod" is not an initialism
ipamd/datastore/data_store.go
Outdated
@@ -270,7 +271,7 @@ func (ds *DataStore) FreeENI() (string, error) { | |||
deletableENI := ds.getDeletableENI() | |||
if deletableENI == nil { | |||
log.Debugf("FreeENI: no deletable ENI") | |||
return "", errors.New("free eni: none of enis can be deleted at this time") | |||
return "", errors.New("free ENI: none of ENI can be deleted at this time") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"no ENI can be deleted "
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ipamd/datastore/data_store.go
Outdated
@@ -319,12 +320,12 @@ func (ds *DataStore) UnAssignPodIPv4Address(k8sPod *k8sapi.K8SPodInfo) (string, | |||
} | |||
} | |||
|
|||
log.Warnf("UnassignIPv4Address: Failed to find pod %s namespace %s Container %s using ip %s", | |||
log.Warnf("UnassignIPv4Address: Failed to find pod %s namespace %s Container %s using IP %s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lowercase Container
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ipamd/datastore/data_store.go
Outdated
k8sPod.IP, k8sPod.Name, k8sPod.Namespace, k8sPod.Container) | ||
return ipAddr.IP, ipAddr.DeviceNumber, nil | ||
} | ||
//TODO handle this bug assert?, may need to add a counter here, if counter is too high, need to mark node as unhealthy... | ||
// this is a bug that the caller invoke multiple times to assign(PodName/NameSpace -> a different IPaddress). | ||
log.Errorf("AssignPodIPv4Address: current ip %s is changed to ip %s for POD(name %s, namespace %s, container %s)", | ||
log.Errorf("AssignPodIPv4Address: current IP %s is changed to IP %s for POD(name %s, namespace %s, container %s)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lowercase POD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ipamd/datastore/data_store.go
Outdated
|
||
// ENIIPPool contains ENI/IP Pool information | ||
// ENIIPPool contains ENI/IP Pool information. The capitalized variables are used for introspection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/The capitalized variables are used for introspection/Exported fields will be Marshaled for introspection/
ipamd/datastore/data_store.go
Outdated
} | ||
|
||
// AddressInfo contains inforation about an IP | ||
// AddressInfo contains inforation about an IP, The Capitalized variables are used for introspection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exported fields will be Marshaled for introspection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
scripts/aws-cni-support.sh
Outdated
echo "=============================================" >> ${LOG_DIR}/${ROUTE_OUTPUT} | ||
echo "ip route show table $i" >> $LOG_DIR/$ROUTE_OUTPUT | ||
ip route show table $i >> $LOG_DIR/$ROUTE_OUTPUT | ||
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why {0..32} instead of ip route show table all
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will use ip route show table all
scripts/aws-cni-support.sh
Outdated
LOG_DIR="/var/log/aws-routed-eni" | ||
|
||
# collecting L-IPAMD introspection data | ||
curl http://localhost:51678/v1/enis | python -m json.tool > ${LOG_DIR}/eni.output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the python module json.tool always going to be available or does this rely on some base image or distribution having made sure it's present already? Is this just for pretty-printing the output? Perhaps l-ipamd could pretty print by default using MarshalIndent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, will remove python pretty print for now
scripts/aws-cni-support.sh
Outdated
curl http://localhost:51678/v1/pods | python -m json.tool > ${LOG_DIR}/pod.output | ||
|
||
# collecting kubeleet introspection data | ||
curl http://localhost:10255/pods | python -m json.tool > ${LOG_DIR}/kubelet.output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raghav @rtripat asked for "kubelet read-only port default is 10255 but it can be changed by users. take it as an input to the aws-cni-support.sh, if not provided default to 10255"
8302c7c
to
c80fec1
Compare
|
||
const ( | ||
// IntrospectionPort is the port for ipamd introspection | ||
IntrospectionPort = 51678 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you pick a different port than this? 51678 is used by the ECS agent and picking this port will prevent this plugin and ECS from ever being used together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samuelkarp thanks and have opened #19 for this.
# This is the 1st commit message: Add VlanId in the cmdAdd Result struct This VlanId will appear in the prevResult during cmdDel request Test prevResult contents CleanUp Pod Network using vlanId from prevResult in CNI itself No need to call ipamd Log formatting changes Added hostNetworking Setup test for pods using security groups revoke unnecessary test agent image changes Revoke unnecessary changes remove focussed test set replica count to total number of branch interface Fix replica count # This is the commit message aws#2: Updated cleanUpPodENI method # This is the commit message aws#3: Skip processing Delete request if prevResult is nil Add Logging vlanId to ipamd # This is the commit message aws#4: Add support to test with containerd nodegroup in pod-eni test # This is the commit message aws#5: Add check for empty Netns() in cni # This is the commit message aws#6: Manifests and Readme updates (aws#1732) * Manifests and Readme updates * update manifest.jsonnet # This is the commit message aws#7: Readme updates (aws#1735) # This is the commit message aws#8: Updates to troubleshooting doc (aws#1737) * Updates to troubleshooting doc * updates to troubleshooting doc # This is the commit message aws#9: imdsv2 changes (aws#1743) # This is the commit message aws#10: fix flaky canary test (aws#1742) # This is the commit message aws#11: add CODEOWNERS (aws#1747)
Add following 2 introspection commands:
Also add aws-cni-support.sh which will collect aws-vpc-cni-k8s cni related debugging information into a support bundle