-
Notifications
You must be signed in to change notification settings - Fork 458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Timeout topology watch and return better error messages for missing topology / namespaces #926
Conversation
8e4e886
to
89a9567
Compare
src/dbnode/environment/config.go
Outdated
InstrumentOpts instrument.Options | ||
HashingSeed uint32 | ||
HostID string | ||
ResolutionTimeout time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to be b/w compatible when making changes to publicly configurable fields.
- Could you mark the old fields deprecated,
- Add the new field,
- Error if both new and old are set, and pick the set field otherwise
Rob had to do some of this in #890 -
m3/src/query/storage/local/config.go
Lines 90 to 117 in 721ac92
// StorageMetricsType is the namespace type. | |
// | |
// Deprecated: Use "Type" field when specifying config instead, it is | |
// invalid to use both. | |
StorageMetricsType storage.MetricsType `yaml:"storageMetricsType"` | |
} | |
func (c ClusterStaticNamespaceConfiguration) metricsType() (storage.MetricsType, error) { | |
result := storage.DefaultMetricsType | |
if c.Type != storage.DefaultMetricsType && c.StorageMetricsType != storage.DefaultMetricsType { | |
// Don't allow both to not be default | |
return result, errBothNamespaceTypeNewAndDeprecatedFieldsSet | |
} | |
if c.Type != storage.DefaultMetricsType { | |
// New field value set | |
return c.Type, nil | |
} | |
if c.StorageMetricsType != storage.DefaultMetricsType { | |
// Deprecated field value set | |
return c.StorageMetricsType, nil | |
} | |
// Both are default | |
return result, nil | |
} | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spoke offline, very unlikely anyone ever set these fields are they are undocumented in are YMLs so gonna make the backward incompatible change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke offlien again, gonna remove timeouts altogether
src/dbnode/storage/database.go
Outdated
@@ -164,7 +164,7 @@ func NewDatabase( | |||
nsInit := opts.NamespaceInitializer() | |||
nsReg, err := nsInit.Init() | |||
if err != nil { | |||
return nil, err | |||
return nil, fmt.Errorf("database: namespace initialized failed: %v. make sure a namespace has been set") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalize the new sentence. Also, include err
in the Errorf
.
Hey, do you want to make sure this supports the case if bringing up a node and using it's coordinator endpoint to create a placement? Before we set an indefinite timeout here we used to basically crash the node before you could create a placement, which was a fairly big issue. Does this change re-introduce that behavior? (Just wanting to get to the bottom of how to best cover both cases) |
@robskillington Yep, we settled on removing the timeout all together for that reason. As long as we put appropriate log messages before waiting in all the places, it should fail the same as it does today. Sound reasonable? |
89a9567
to
3090ae1
Compare
3090ae1
to
f504970
Compare
Codecov Report
@@ Coverage Diff @@
## master #926 +/- ##
==========================================
- Coverage 77.96% 77.91% -0.05%
==========================================
Files 410 410
Lines 34399 34358 -41
==========================================
- Hits 26818 26770 -48
- Misses 5734 5741 +7
Partials 1847 1847
Continue to review full report at Codecov.
|
@@ -185,19 +184,6 @@ func (t *dynamicTopology) MarkShardsAvailable( | |||
return ps.MarkShardsAvailable(instanceID, shardIDs...) | |||
} | |||
|
|||
func waitOnInit(w services.Watch, d time.Duration) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We finally removed this, noice!
InstrumentOpts instrument.Options | ||
HashingSeed uint32 | ||
HostID string | ||
NamespaceResolutionTimeout time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't just delete these (b/w compatibility and all), need to mark deprecated instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke offline and agreed its ok in this case because it was undocumented config and we're breaking the functionaltiy anyways
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We weren't passing a timeout properly to the ServiceClient so:
would hang forever if there was no topology.
In addition, in the case where there is no topology or namespace the error message is cryptic (timeout) when it could be slightly more helpful