Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: k0s runtime remain work #1686

Merged
merged 5 commits into from
Sep 15, 2022

Conversation

starComingup
Copy link
Member

Describe what this PR does / why we need it

After k0s init step, there are update/join/delete/reset interface which need to implement, this PR finish all k0s cluster install action.

Does this pull request fix one issue?

NONE, but a step for #1399

@github-actions github-actions bot added the test label Sep 5, 2022
RemoveK0sBin = "rm -rf /usr/bin/k0s"
RemoteRemoveEtcHost = "sed -i \"/%s/d\" /etc/hosts"
RemoteRemoveRegistryCerts = "rm -rf " + DockerCertDir + "/%s*"
KubeDeleteNode = "kubectl delete node %s"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather put those command at the caller place. This will break the readability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! next commit will fix it!

for _, master := range masters {
master := master
eg.Go(func() error {
master := master
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is no need to declare the temp var again.

@codecov-commenter
Copy link

codecov-commenter commented Sep 5, 2022

Codecov Report

Base: 18.41% // Head: 18.41% // No change to project coverage 👍

Coverage data is based on head (a487276) compared to base (7f76157).
Patch has no changes to coverable lines.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1686   +/-   ##
=======================================
  Coverage   18.41%   18.41%           
=======================================
  Files          66       66           
  Lines        5354     5354           
=======================================
  Hits          986      986           
  Misses       4242     4242           
  Partials      126      126           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

RemoveKubeConfig = "rm -rf /usr/bin/kube* && rm -rf ~/.kube/"
RemoveK0sBin = "rm -rf /usr/bin/k0s"
RemoteRemoveEtcHost = "sed -i \"/%s/d\" /etc/hosts"
RemoteRemoveRegistryCerts = "rm -rf " + DockerCertDir + "/%s*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not use format in the const string value. It would make code quite complicated.

RemoveK0sBin = "rm -rf /usr/bin/k0s"
RemoteRemoveEtcHost = "sed -i \"/%s/d\" /etc/hosts"
RemoteRemoveRegistryCerts = "rm -rf " + DockerCertDir + "/%s*"
KubeDeleteNode = "kubectl delete node %s"
Copy link
Member

@allencloud allencloud Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should use an API calling via k8s apiclient torwards apiserver, rather than using a shell command.

master := master
logrus.Infof("Start to delete master %s", master)
if err := k.deleteMaster(master); err != nil {
logrus.Errorf("failed to delete master %s: %v", master, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should return err here, otherwise the eg.Go() could not catch the detailed error, and the return eg.Wait() would always return nil, which is definitely unreasonable.

func (k *Runtime) deleteMaster(master net.IP) error {
ssh, err := k.getHostSSHClient(master)
if err != nil {
return fmt.Errorf("failed to delete master: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your error log message should be optimized. When logging error message, you should set up a rule for yourself, where to locate the wrap message. In this case, the wrap message is failed to delete master:. And in Line 49, you have code logrus.Errorf("failed to delete master %s: %v", master, err). Then if an error is returned back, then you error message would be failed to delete master 1.1.1.1: failed to delete master: xxxxxx. It is duplicated and definitely inproper.

I wish that you could pay more attention on every part of the code. Only by patience and carefulness, could quality of software be achieved.

return eg.Wait()
}

func (k *Runtime) deleteMaster(master net.IP) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a detailed deep implementation of delete a master node. I think we should explain clearly what is the STEPS to delete a master by comments. If we understand the steps and then the code review could be quite easy. Actually we should review the design firstly, and code secondly. No design there, and we can hardly do the code review. Do you think so ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! when I read kubernetes runtime module, I have the same feel. which files to delete? and what logic to follow. I so sorry to made this, fix this ASAP.

}

func (k *Runtime) isHostName(master, host net.IP) (string, error) {
hostString, err := k.CmdToString(master, "kubectl get nodes | grep -v NAME | awk '{print $1}'", ",")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use k8s apiClient to get the information you want. Otherwise it is quite strange to add shell action in golang code. If KubeCtl released a higher version, and the output format changes, the code here will immediately meet the compatibility issue. What a bad design. Right?

for _, h := range hosts {
if strings.TrimSpace(h) == "" {
continue
} else {
Copy link
Member

@allencloud allencloud Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is a continue in if block, you can remove the else to remove indent code.

if strings.TrimSpace(h) == "" {
	continue
}
hh := strings.ToLower(h)
fromH := strings.ToLower(hostName)
if hh == fromH {
	name = h
	break
}

@allencloud
Copy link
Member

I just reviewed the delete_master.go file, and attached several feedbacks. I think you have the same issue in other files, like delete_node.go, join_node.go and so on. Please update the original code all all files carefully. Then you could comment this pr to seek my further review. Thanks. @starComingup 🌹 🐯 🦁

STEP5: remove bin file such as: kubectl, and remove .kube directory
STEP6: remove k0s bin file
STEP7: delete node though k8s client
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if one of the cmds failed, will sealer throw the error for that failed cmd?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the error will be thrown by ssh.Async and captured by the caller.

}

if len(masterIPs) > 0 {
hostname, err := k.isHostName(master)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess "isHostName" should be "getHostName"?

masterIPs := []net.IP{}
for _, ip := range k.cluster.GetMasterIPList() {
if !ip.Equal(master) {
masterIPs = append(masterIPs, ip)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

masterIPs is not used in the below part.
IMO, length of masterIPs plays the role.
So I think we can do like:

masterExist := len(k.cluster.GetMasterIPList()) > 1
if !masterExist {
  return
}

And do we have to judge the condition len(k.cluster.GetMasterIPList()) == 1 and k.cluster.GetMasterIPList()[0].Equal(master) == true. Is this case possible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read this code block and I think It did a thing that judge the rest master nodes need greater than one, you code is more clearly and simplify

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for len(k.cluster.GetMasterIPList()) == 1) ,i think it is no need to judge it. Code logic will protect the cluster have a master at least.


var name string
for _, h := range hosts {
if strings.TrimSpace(h) == "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do TrimSpace(h) here. I guess the result from TrimSpace(h) should pass down? hh := strings.ToLower(strings.TrimSpace(h))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think we can't pass down value after processed. I don't know how hostname special, but I know field got by k8s client would be it input.

v2 "github.com/sealerio/sealer/types/api/v2"
)

func RuntimeChoose(rootfs string, cluster *v2.Cluster, config *kubeadm.KubeadmConfig) (runtime.Interface, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@starComingup Here I have a point to discuss with you.

If runtime is a field of details processor instance, or belongs to a processor's function?

If runtime is a field of processor, then RuntimeChoose should be used in a processor's initialization, rather than set runtime in processor's function.

For example, I see

type ScaleProcessor struct {
    runtime runtime.Interface
    xxx
}

And I see

func (s *ScaleProcessor) PreProcess(cluster *v2.Cluster) error {
	runTime, err := RuntimeChoose(platform.DefaultMountClusterImageDir(cluster.Name), cluster, s.ClusterFile.GetKubeadmConfig())
	if err != nil {
		return fmt.Errorf("failed to init default runtime: %v", err)
	}
	s.Runtime = runTime

I feel quite strange. Why do you RuntimeChoose in ScaleProcessor and set ScaleProcessor's runtime filed, instead of set runtime type in NewScaleProcessor? Initialization in NewScaleProcessor could make code much more clear and more object-oriented.

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, please change the function name into ChooseRuntime or GetRuntime rather than RuntimeChoose.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I have a point to discuss with you.

If runtime is a field of details processor instance, or belongs to a processor's function?

If runtime is a field of processor, then RuntimeChoose should be used in a processor's initialization, rather than set runtime in processor's function.

For example, I see

The old code set kubernetes.runtime here, so I just make this. But when I coded this, I also felt confused why origin logic set runtime in sub function. I will try to modify it!

return k0s.NewK0sRuntime(cluster)
// Todo case runtime.K3s:
default:
return kubernetes.NewDefaultRuntime(cluster, config)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish to see more thinking details about the default case. From my point of view, if there is no metadata.ClusterRuntime in image rootfs details, we take it as k8s by default. Because in the original implementation of ClusterImage, there is no metadata.ClusterRuntime, and we have to make the new version of sealer compatible with the legacy ClusterImage format. For what is worth, I suggest to add a log to tell end-user that he is using a legacy version, and using new version of sealer and ClusterImage is encouraged.

Otherwise, I think you should add a line of explicit log at Line31 to show the ClusterRuntime info.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with add a cluster runtime info and hint end-user using new version ClusterImage.

return eg.Wait()
}

func (k *Runtime) resetMasters(nodes []net.IP) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use a eg, _ := errgroup.WithContext(context.Background()) in resetMasters as well, just like what you did in resetNodes? @starComingup

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do what you advice, then I will verify it.

@@ -38,48 +41,60 @@ type Runtime struct {
RegConfig *registry.Config
}

var ForceDelete bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not see any assignment to this variable. Please double check it.

Copy link
Member Author

@starComingup starComingup Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this in Line245 confirm force delete.

"k0s start",
}
var err error
for _, ip := range IPs {
Copy link
Member

@allencloud allencloud Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use a eg, _ := errgroup.WithContext(context.Background()) here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it can speed up too much, so I'll keep this logic for now.

return nil
}

func (k *Runtime) upgradeMasters(IPs []net.IP, binPath string) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use a eg, _ := errgroup.WithContext(context.Background()) here?

Copy link
Member Author

@starComingup starComingup Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep this logic for now, If necessary, I will modify it.

@allencloud
Copy link
Member

Left some comment, and wait for @starComingup 's feedback and update before merging.

@starComingup
Copy link
Member Author

All comment and code updated! Waiting for test.

@allencloud
Copy link
Member

Thanks greatly for your patience during review. @starComingup I think your work could improve sealer quite a lot. Thanks again.

@allencloud allencloud merged commit d1a3e78 into sealerio:main Sep 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants