Override _id of replicaSet in automation config (complementary to additionalMongodConfig replication.replSetName) #1650

1st8 opened this issue Jan 6, 2025 · 1 comment

1st8 commented Jan 6, 2025

What did you do to encounter the bug?

Steps to reproduce the behavior:

  1. Have a legacy MongoDB cluster (deployed without the operator) running with `--replSet=rs0` in a StatefulSet named `mongodb`.
  2. Migrate the cluster nodes to the operator using:

```yaml
metadata:
  name: mongodb
spec:
  # ...
  additionalMongodConfig:
    replication.replSetName: rs0
```

What did you expect?

I expected the agent to receive `rs0` as the `_id` of the replica set when `additionalMongodConfig` sets `replication.replSetName`.

Alternatively, there could be an override for the replica set name directly under `spec` (e.g. a hypothetical `spec.replicaSetName`), so that a single value feeds both `replicaSets._id` (agent) and `replication.replSetName` (mongod).

What happened instead?

It receives `mongodb` (the resource name) instead.

Screenshots

(Two screenshots attached, taken 2025-01-06.)

Additional context

We are looking to migrate 100+ clusters to the operator, and this is the last piece of the puzzle.

I confirmed this by stopping the operator, manually changing the `_id` to `rs0` in the `mongodb-config` secret, and then seeing the agent become ready.

After starting the operator again, it undoes the changes to the secret, of course.
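For reference, the manual check was roughly the following (the operator deployment name and the `cluster-config.json` secret key are assumptions about a default install):

```sh
# Stop the operator so it does not immediately revert the secret.
kubectl scale deployment mongodb-kubernetes-operator --replicas=0

# Extract the automation config, change replicaSets[0]._id to "rs0",
# and write the secret back. (Key name "cluster-config.json" is an
# assumption about the secret layout.)
kubectl get secret mongodb-config -o jsonpath='{.data.cluster-config\.json}' \
  | base64 -d > cluster-config.json
# ... edit cluster-config.json: "replicaSets":[{"_id":"rs0", ...}] ...
kubectl create secret generic mongodb-config --from-file=cluster-config.json \
  --dry-run=client -o yaml | kubectl apply -f -
```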

I had a thorough look through the relevant sources and didn't find a way to fix this with the current implementation.

The automation config builder name is set to `mdb.Name` here:

and then `Id` is set to `b.name` here:

`b.name` is also used to set `replication.replSetName` here:

which suggests that the override solution might be preferable.
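To make the idea concrete, the precedence could look roughly like this (a self-contained sketch with stand-in types; `replSetId` and the override field path are hypothetical, not the operator's actual API):

```go
package main

// Minimal stand-ins for the operator's automation config model; the real
// definitions live in the operator codebase.
type ReplicaSet struct{ Id string }

type AutomationConfig struct{ ReplicaSets []ReplicaSet }

// replSetId shows the proposed precedence: an explicit override wins,
// otherwise fall back to the resource name (today's behavior). The override
// field itself (e.g. spec.automationConfig.replicaSet.id) does not exist yet.
func replSetId(resourceName, override string) string {
	if override != "" {
		return override
	}
	return resourceName
}

func main() {
	// With the override set, replicaSets[0]._id and each mongod's
	// replication.replSetName would both resolve to "rs0".
	cfg := AutomationConfig{ReplicaSets: []ReplicaSet{{Id: replSetId("mongodb", "rs0")}}}
	_ = cfg
}
```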

Did I miss something?

If I could contribute a fix for this, what should I look out for in my implementation to make it acceptable?


  • YAML definitions of your MongoDB Deployment(s):

```yaml
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongodb
spec:
  members: 3
  type: ReplicaSet
  version: "5.0.28"
  # ...
  additionalMongodConfig:
    replication.replSetName: rs0
```
  • The agent clusterconfig of the faulty members:

```json
{"version":2,"processes":[{"name":"mongodb-0","disabled":false,"hostname":"mongodb-0.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net","args2_6":{"net":{"port":27017},"replication":{"replSetName":"rs0"},"security":{"transitionToAuth":true},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"cacheSizeGB":"0.5","journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"5.0","processType":"mongod","version":"5.0.28","authSchemaVersion":5},{"name":"mongodb-1","disabled":false,"hostname":"mongodb-1.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net","args2_6":{"net":{"port":27017},"replication":{"replSetName":"rs0"},"security":{"transitionToAuth":true},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"cacheSizeGB":"0.5","journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"5.0","processType":"mongod","version":"5.0.28","authSchemaVersion":5},{"name":"mongodb-2","disabled":false,"hostname":"mongodb-2.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net","args2_6":{"net":{"port":27017},"replication":{"replSetName":"rs0"},"security":{"transitionToAuth":true},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"cacheSizeGB":"0.5","journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"5.0","processType":"mongod","version":"5.0.28","authSchemaVersion":5}],"replicaSets":[{"_id":"mongodb","members":[{"_id":0,"host":"mongodb-0","arbiterOnly":false,"votes":1,"priority":1},{"_id":1,"host":"mongodb-1","arbiterOnly":false,"votes":1,"priority":1},{"_id":2,"host":"mongodb-2","arbiterOnly":false,"votes":1,"priority":1}],"protocolVersion":"1","numberArbiters":0}],"auth":{"usersWanted":[{"mechanisms":[],"roles":[{"role":"clusterAdmin","db":"admin"},{"role":"userAdminAnyDatabase","db":"admin"}],"user":"tixxt","db":"admin","authenticationRestrictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"ntalm9D0Jj4Dt0zYRfBlihEwyc2U/FUfkBsMLQ==","serverKey":"hHycYJGOBRUWuJ/B92ZgY+6zDeyaLA7sVa/ILPADmOw=","storedKey":"pcWfo77fY5yIzGSB1YNsBKvqEHlWaaMCPE1OFc0iZic="},"scramSha1Creds":{"iterationCount":10000,"salt":"a0WKosyWCXTivtM1ywCw+Q==","serverKey":"W04zj/xVioHvjRNjaqH8Di02I8k=","storedKey":"0IjhW1a5s2flWcDoq/TUsWyO6Z4="}}],"disabled":false,"authoritativeSet":false,"autoAuthMechanisms":["SCRAM-SHA-256"],"autoAuthMechanism":"SCRAM-SHA-256","deploymentAuthMechanisms":["SCRAM-SHA-256"],"autoUser":"mms-automation","key":"qENSGM4WjMUAtkLPGBgSos3NqgTeRde9HMir6DZ+mil8M259JlcKJcEP33pIDOQXHrGommQrj7CzAnaRmFl6FvfXQgW+2dqo6yAtt3lIUFBr9fUP6vfqNLvBPD2QQL4s+LG3vwwud0G8Pnvr6ZksJUxHqdljXd8SYnootmEs6UtHyC8G1/8m0EHwNu+ez2Wg7+3naenpSxIxGaLR0ZWnTiCuejiFU3M21m3jJ91pBXSWmi6vzKEKQlAkyhy9Ur4z2X29U0wkpVcAwSHbhvNksWpaBo13ZuHRQSyaobKhyX3MlEL8pwyQAqSlrjEAP7oqCZvF5l3Cxkjp3ekmEG/w15q5gjZ/fhKad+2blJLw72F/dwMWQrI9gBPoXteaQ9qFh06E9a+qZrBFEULSuLKqXp2C8/p4fzsBo0Dp5Eg2Xbg0/I0wYKgu4uYVziKrurqJ72Ko7osK4FfPHqBDk9b5nq2DZ8IihIBgC8NNCNEDypbZi24nkvqgBUEmTb5I4ZkZo5W3ZK7kSZL1L0QnVY7xxL5suCPFyEsCzV2bZ+NNUDRMIJ/H9edblZCu4MLd0Mr/RHIG4hfCkFtAxO3bOEMpZR1gsTheS9LMMgK+mTZ5vDXKY8iJhOKp9CGdZt2dFqcNVbiW+3LUJa00ar2S1YllozOBzzU=","keyfile":"/var/lib/mongodb-mms-automation/authentication/keyfile","keyfileWindows":"%SystemDrive%\\MMSAutomation\\versions\\keyfile","autoPwd":"sG3uBaAQisy09xjuz8dG"},"tls":{"CAFilePath":"","clientCertificateMode":"OPTIONAL"},"mongoDbVersions":[{"name":"5.0.28","builds":[{"platform":"linux","url":"","gitVersion":"","architecture":"amd64","flavor":"rhel","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"amd64","flavor":"ubuntu","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"aarch64","flavor":"ubuntu","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"aarch64","flavor":"rhel","minOsVersion":"","maxOsVersion":"","modules":[]}]}],"backupVersions":[],"monitoringVersions":[],"options":{"downloadBase":"/var/lib/mongodb-mms-automation"}}
```
  • The agent health status of the faulty members:

```json
{"statuses":{"mongodb-0":{"IsInGoalState":false,"LastMongoUpTime":1736175160,"ExpectedToBeUp":true,"ReplicationStatus":1}},"mmsStatus":{"mongodb-0":{"name":"mongodb-0","lastGoalVersionAchieved":1,"plans":[{"automationConfigVersion":1,"started":"2025-01-06T13:24:00.086578872Z","completed":null,"moves":[{"move":"Start","moveDoc":"Start the process","steps":[{"step":"StartFresh","stepDoc":"Start a mongo instance  (start fresh)","isWaitStep":false,"started":"2025-01-06T13:24:00.08660494Z","completed":"2025-01-06T13:24:09.865099778Z","result":"success"}]},{"move":"WaitAllRsMembersUp","moveDoc":"Wait until all members of this process' repl set are up","steps":[{"step":"WaitAllRsMembersUp","stepDoc":"Wait until all members of this process' repl set are up","isWaitStep":true,"started":"2025-01-06T13:24:09.865192101Z","completed":null,"result":"wait"}]},{"move":"RsInit","moveDoc":"Initialize a replica set including the current MongoDB process","steps":[{"step":"RsInit","stepDoc":"Initialize a replica set","isWaitStep":false,"started":null,"completed":null,"result":""}]},{"move":"WaitFeatureCompatibilityVersionCorrect","moveDoc":"Wait for featureCompatibilityVersion to be right","steps":[{"step":"WaitFeatureCompatibilityVersionCorrect","stepDoc":"Wait for featureCompatibilityVersion to be right","isWaitStep":true,"started":null,"completed":null,"result":""}]}]},{"automationConfigVersion":1,"started":"2025-01-06T13:35:35.257483563Z","completed":"2025-01-06T13:35:39.182499358Z","moves":[{"move":"EnsureAutomationCredentials","moveDoc":"Ensure the automation user exists","steps":[{"step":"EnsureAutomationCredentials","stepDoc":"Ensure the automation user exists","isWaitStep":false,"started":"2025-01-06T13:35:35.257505082Z","completed":"2025-01-06T13:35:39.022009499Z","result":"success"}]},{"move":"AdjustUsers","moveDoc":"Adjust Users","steps":[{"step":"AdjustUsers","stepDoc":"Adjust Users","isWaitStep":false,"started":"2025-01-06T13:35:39.022147572Z","completed":"2025-01-06T13:35:39.182354797Z","result":"success"}]}]}],"errorCode":104,"errorString":"\u003cmongodb-0\u003e [14:51:42.024] Failed to find a plan!","waitDetails":{"WaitAllRsMembersUp":"[]","WaitAuthSchemaCorrect":"auth schema will be updated by the primary","WaitCanStartFresh":"process not up","WaitCannotBecomePrimary":"Wait until the process is reconfigured with priority=0 by a different process","WaitDefaultRWConcernCorrect":"waiting for the primary to update defaultRWConcern","WaitForResyncPrimaryManualInterventionStep":"A resync was requested on a primary. This requires manual intervention","WaitHealthyMajority":"[]","WaitMultipleHealthyNonArbiters":"[]","WaitNecessaryRsMembersUpForReconfig":"[]","WaitPrimary":"This process is expected to be the primary member. Check that the replica set state allows a primary to be elected","WaitProcessUp":"The process is running, but not yet responding to agent calls","WaitResetPlacementHistory":"config servers  haven't seen the marker"}}}
```
  • The verbose agent logs of the faulty members:

```
[2025-01-06T14:53:42.436+0000] [.warn] [src/director/director.go:computePlan:297] <mongodb-0> [14:53:42.436] ... No plan could be found - not in goal state because of:
[All the following are false:
    ['desiredState.ReplSetConf' != <nil> ('desiredState.ReplSetConf' = ReplSetConfig{id=mongodb,version=0,commitmentStatus=false,configsvr=false,protocolVersion=1,forceProtocolVersion=false,writeConcernMajorityJournalDefault=,members={id:0,HostPort:mongodb-0.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:1,HostPort:mongodb-1.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:2,HostPort:mongodb-2.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},settings=map[]})]
    ['currentState.ReplSetConf.Id' != 'desiredState.ReplSetConf.Id' : (rs0) vs. (mongodb)]
```
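In other words: the live replica set (initiated before the migration) reports `_id: rs0`, the desired config generated by the operator says `_id: mongodb`, and the agent cannot find any plan that reconciles the two.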

Also seen previously:

```
(InvalidReplicaSetConfig) Rejecting initiate with a set name that differs from command line set name, initiate set name: mongodb, command line set name: rs0
```
1st8 commented Jan 6, 2025

What do you think about adding a new `Id string` field to `OverrideReplicaSet`?

```go
type OverrideReplicaSet struct {
	// +kubebuilder:validation:Type=object
	// +kubebuilder:pruning:PreserveUnknownFields
	Settings MapWrapper `json:"settings,omitempty"`
}
```
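Concretely, the amended struct could look like this (the field name, JSON tag, and comment are my suggestion, not existing API):

```go
type OverrideReplicaSet struct {
	// Id, if set, overrides replicaSets[0]._id in the generated
	// automation config (proposed addition, not yet implemented).
	Id string `json:"id,omitempty"`

	// +kubebuilder:validation:Type=object
	// +kubebuilder:pruning:PreserveUnknownFields
	Settings MapWrapper `json:"settings,omitempty"`
}
```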

So that this would work:

```yaml
spec:
  automationConfig:
    replicaSet:
      id: rs0
```
