Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MORAY-418 moray should queue requests when new PG connections fail MORAY-339 moray needs to use exponential backoff when connection to postgresql MORAY-294 suspect moray doesn't honor its connection limit properly MORAY-427 moray queuing needs work #5

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

joyent-automation
Copy link
Contributor

MORAY-418 moray should queue requests when new PG connections fail
MORAY-339 moray needs to use exponential backoff when connection to postgresql
MORAY-294 suspect moray doesn't honor its connection limit properly
MORAY-427 moray queuing needs work

This PR was migrated-from-gerrit, https://cr.joyent.us/#/c/3550/.
The raw archive of this CR is here.
See MANTA-4594 for info on Joyent Eng's migration from Gerrit.

CR discussion

@chudley commented at 2018-04-06T13:49:07

Patch Set 1:

(26 comments)

A lot of this might come from my inexperience with cueball, so please bear with me a little!

I should also note that I tested this image as part of MORAY-464 and it resolved my issue in that case.

@timkordas commented at 2018-04-06T18:29:28

Patch Set 1:

(19 comments)

Patch Set 1 code comments
lib/manatee.js#47 @chudley

I'm not sure this does the same thing as before, but I haven't been able to dig into why yet. Previously, if a connection had been destroyed and not yet recreated, all of our counters would be 0. With this change in place, I see the "available" counter at 1 in this case.

I'd be happy to work through the example I have, if it helps!

lib/manatee.js#47 @timkordas

I would love to see the example you have.

lib/manatee.js#47 @chudley

Sure thing. I'll see how's best to get this written down.

lib/manatee.js#47 @chudley

I'm having a hard time reproducing this one, unfortunately.

In any case, I think the counter that we've got here from cueball will be fine. Even if the connection can't be made, I guess the connection is still technically open. There's also a good chance that between the connection attempts that cueball makes that there's 0 connections, but I haven't observed that.

Ultimately, so long as they're not showing as "available" (which they don't), then I'd say this is fine.

lib/manatee.js#257 @chudley

This is also explained in the comment below, so it doesn't look like we need this here.

lib/manatee.js#257 @timkordas

Done

lib/manatee.js#259 @chudley

I don't think this log is required any more, at least if we're not calling "previous.close()".

lib/manatee.js#259 @timkordas

Done

lib/manatee.js#277 @chudley

I don't think it's worth keeping the old MORAY-225 comment here. I also tend to work on the advice I got from JoshC previously that is to not make use of ticket IDs in comments. Now that we have commit message formalities in linking to tickets, that tends to show the history well enough.

lib/manatee.js#277 @timkordas

thanks for the feedback, next changeset includes a rewrite of this comment.

lib/manatee.js#278 @chudley

Now that we're using cueball, do we need to do this updateBackend step if the primary hasn't actually changed, or will cueball attempt to re-create this connection for us?

Basically, if self._database.url === topology[0].url, do we need to updateBackend at all?

lib/manatee.js#278 @timkordas

the update does emit some messages to cueball.

I think I'd need to do some renewed failover testing before changing this as you suggest. I'll try to get to that.

lib/manatee.js#283 @chudley

Using the old "previous" variable name here made this a little confusing for me to read. What I think this line is doing is just confirming that we already have a database and resolver object created, and if so re-using them. Does that make sense?

I think "previous" made sense before (at least on L256) as it read well in the context of always closing the old connection. Can we just use self._database here instead?

lib/manatee.js#283 @timkordas

yep. agree that is is confusing. thanks!

lib/pg.js#96 @chudley

Why is PGClient now an EventEmitter? Is it because of the close event later?

lib/pg.js#96 @timkordas

exactly. it is required by the cueball API (for connect and close):

https://joyent.github.io/node-cueball/#code_close_code_required

lib/pg.js#154 @chudley

In node-pooling, we set ok to true if ok === undefined after pgAssert (https://github.com/mcavage/node-pooling/blob/4b41e369d2292129fad26a7695027cb1fa21eaa5/lib/pool.js#L462). This confuses me, but I think we can get undefined back from pgAssert if we blow one of its assertions, but only because we're in a try/catch.

What I'm trying to get at is: are we sure that ok will always either be true or false after this try/catch? Why was the ok === undefined check required previously, and now not needed?

lib/pg.js#154 @timkordas

I'm skeptical that ok can ever be undefined; and if it is that means the pgAssert() did something strange ? but the node-pooling code assumes that "something strange" is OK ?

if I add the check for undefined, should I assume that undefined means that we consider it true ? I'm not honestly sure what to do here.

My understanding was that either execution would proceed through the assert (where the conditional at the end would set it appropriately), or blow up and set ok in the catch-block.

lib/pg.js#154 @chudley

Yes, this was really confusing in node-pooling. I think we can assign ok to undefined if pgAssert blows an assertion (at least that's how it looks if I do it in a repl), but you're right: why would this be "ok"?

From what I can tell what we have now is fine, but I just wanted to bring it up in case you were able to decipher the node-pooling way of doing things.

lib/pg.js#161 @chudley

Is there a reason for this temporary re-assignment? We seem to do the following patter elsewhere:

this.handle.close();
this.handle = null;

Would that work here, too?

lib/pg.js#161 @timkordas

I think I habitually do it this way in case of weird side-effects in the .close() call (to avoid calling the close() twice?)

anyhow, will fix.

lib/pg.js#186 @chudley

A+ best block comment style.

lib/pg.js#186 @timkordas

Done

lib/pg.js#216 @chudley

Nit: Any particular reason for moving this? It looks the same, but it's nice to have gerrit confirm that there are in fact no changes (at least so long as the move isn't required).

lib/pg.js#216 @timkordas

I think maybe this got moved in one of several rebases. will move it back. :-)

lib/pg.js#421 @chudley

I think I saw this come up in the previous CR, and I understand that PGClient.close() was only called from pgDestroy (which is what we passed to node-pooling).

It seems like other pieces (_database, _resolver) use the "close" naming convention. It's also a little surprising that calling the "destroy" method would see a "close" event get emitted. I'm still working my way through this, but is there a specific reason for the change?

lib/pg.js#421 @timkordas

this is an attempt to conform to the cueball API (it requires PGClient to have a "destroy" method; which emits "close").

lib/pg.js#430 @chudley

Is cueball watching for this event, hence PGClient now being an EventEmitter?

lib/pg.js#430 @timkordas

exactly.

lib/pg.js#474 @chudley

I think either jsstyle or jslint would usually complain that this isn't parenthesised.

lib/pg.js#474 @timkordas

Done

lib/pg.js#477 @chudley

Nit: This comment would have been helpful before list() at L469. Perhaps Instead of the resolver comment we have on lib/manatee.js, we should document the single backend thing on the PGResolver declaration above?

lib/pg.js#477 @timkordas

Done

lib/pg.js#486 @chudley

This is interesting. Is this a normal thing to do in cueball? Given that our postgres instances are (or should be) unique anyway, do we need to create a UUID here, or can we use their connection string as the key?

lib/pg.js#486 @timkordas

I think I did it this way to distinguish all updates from previous updates. I'm not sure that's actually useful or not. Updates are only supposed to happen during topology changes.

lib/pg.js#486 @chudley

Good point, I can agree with that. I wonder if we'd lose any information somehow by not encoding the PG URL into the key, but I don't see this as major. Perhaps worth pinging Alex to see if he has any pointers on keying backends, and if it's useful at all to have them be useful to humans?

lib/pg.js#498 @chudley

Nit: I think this reads better when we have the variable we're asserting first, and the vale we expect second, for example:

assert.equal(_pg._moray_txn, false, 'xxx');

I see that this has come from the release() method, but I'm not sure of the implications of moving it. Would you be able to help me understand the move?

lib/pg.js#498 @timkordas

agreed!

lib/pg.js#511 @chudley

I think this sentence could be reworded

... interface expects does an ...

lib/pg.js#511 @timkordas

"expects does" makes no sense. thanks.

lib/pg.js#519 @chudley

It looks like previously an err on the callback was expected, but here we want an explicit call to close or release.

This looks like we could call handle.close() twice; once here and once on end. Do you know if that's the case? If we see an error event, do we also see the end event? Should we set ok to false in this error event handler instead?

lib/pg.js#519 @timkordas

this looks sketchy. setting ok to false seems to be the intent.

lib/pg.js#519 @timkordas

actually, it looks like if we get an error on the req, we don't necessarily get end. so that is why this is like this, I think.

lib/pg.js#519 @chudley

That makes sense. I looked into this recently with some of my MANATEE-380 work, and I wasn't sure if it was actually a problem to call close() twice on the same client (I don't think it is, but I couldn't convince myself that it would get called twice in the first place).

I wonder if some logging would help if this were ever a problem in the future? A lot of our components do these "health check" log messages that can be a pain, but I think if they're trace of debug level with a good way of identifying them then perhaps that would be good. Not major.

lib/pg.js#519 @timkordas

Adding some trace-level stuff here seems like a good idea.

lib/pg.js#547 @chudley

Previously we had this instantiation after the base node-postgres client emitted its "connect" event. Is it OK to change this, as we've done here?

lib/pg.js#547 @timkordas

I'm trying to use the pgc._handleClientError() for all of the errors, so I need pgc to be bound.

the construction-path for connections is the largest piece of code that's changed for cueball.

the constructor used by cueball is supposed to return the PGClient instance. it may return without being connected, this is a change in the API from how things work with node-pooling.

I can try to clean up the error path, I think.

lib/pg.js#547 @timkordas

I think that to conform to the cueball API:
https://joyent.github.io/node-cueball/#code_constructor_backend_code

we must return an object obeying the protocol, so we need a PGClient no matter what. I think making this instance here makes sense -- once we're called we've got to have one.

lib/pg.js#588 @chudley

Could you explain why we pass this function here?

lib/pg.js#588 @timkordas

I'm not sure I follow. I replaced client.removeAllListeners('error') with something more specific. I thought this was correct.

I feel like I must be missing something.

lib/pg.js#588 @chudley

Perhaps I'm missing something, too.

My understanding here is that we've successfully connected so are removing our listeners (I understand that PGClient listens instead). When we remove these listeners it seems that now we're passing the onError function, which would get called on completion of the listener removal and mean we log the message "unable to create PG client", but that's not my understanding of what's happened.

lib/pg.js#588 @timkordas

the second arg to removeListener() isn't a callback afaik, it is the listener to remove.

From the docs "Removes the specified listener from the listener array for the event named eventName."

really, I think that removeAllListeners(...) would be fine, but I was trying to be more specific.

But that said, I think that removeAllListeners() is the pattern used elsewhere in this code, so I'll change it to that.

lib/pg.js#588 @chudley

Sorry about this, you're right! This is indeed the listener to remove, not the callback that will be fired when it is removed.

Being specific here seems like a good idea. It's not a pattern I've seen much in all honesty, but I have no preference either way.

lib/pg.js#699 @chudley

Could we perhaps group these dummy values? Also, only defaultPort seems to be the only one commented as a dummy value, but domain looks to be dummy, too.

Maybe we could call the dummy values out explicitly in the comment above or something?

lib/pg.js#699 @timkordas

they are grouped, if I understand you. But you wouldn't know that since they aren't commented as dummy. :-)

lib/pg.js#772 @chudley

Could we name this "handle"?

lib/pg.js#772 @timkordas

Done

lib/pg.js#779 @chudley

Looks like we're missing a ')'.

lib/pg.js#779 @timkordas

Done

lib/pg.js#807 @chudley

Is this required? I can't find the consumer of this, and it looks like getPoolStats below reaches into this.pool anyway.

lib/standalone.js#198 @chudley

What is the difference between "resolver.stop()" and "resolver.stopResolver()" (from lib/manatee.js)?

lib/standalone.js#198 @timkordas

this appears to be a typo. The function is name stopResolver() but the method is named stop().

Thanks!

@timkordas commented at 2018-04-06T21:22:18

Patch Set 2:

(5 comments)

@timkordas commented at 2018-04-06T21:33:57

Patch Set 2:

(3 comments)

@chudley commented at 2018-04-09T12:26:02

Patch Set 3:

(6 comments)

@timkordas commented at 2018-04-09T18:36:06

Patch Set 3:

(2 comments)

Patch Set 3 code comments
lib/manatee.js#260 @chudley

because because ...

of the wonderful things he does?

(sorry)

@chudley commented at 2018-04-11T08:43:21

Patch Set 5: Code-Review+1

(2 comments)

Thanks for putting this together! I think it would be a good idea to have Cody back on this change request (I can see he had some comments on the old one), perhaps for both CR and IA.

@chudley commented at 2019-03-21T08:46:03

Removed the following votes:

…RAY-339 moray needs to use exponential backoff when connection to postgresql MORAY-294 suspect moray doesn't honor its connection limit properly MORAY-427 moray queuing needs work
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant