Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(clustering/compat): using array to store compatible checker functions #10152

Merged
merged 11 commits into from
Apr 6, 2023

Conversation

chronolaw
Copy link
Contributor

@chronolaw chronolaw commented Jan 21, 2023

Summary

Move compatible config-checking logic into a hash table, making code more readable.

Need special cherry-pick action to EE after #10400 and #10501 be cherry-picked.

Checklist

@chronolaw chronolaw changed the title refactor(clustering/compat): using compatible checker functions hash table refactor(clustering/compat): using hash table to store compatible checker functions Jan 21, 2023
@chronolaw chronolaw marked this pull request as ready for review January 21, 2023 07:03
@chronolaw chronolaw force-pushed the refactor/clustering_compat_check branch from d6d7909 to 1025f94 Compare January 22, 2023 06:04
has_update = true
end
end
for ver, fn in pairs(compatible_checkers) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that pairs() is sufficient for this. There is an expectation that the compatibility change functions are executed in descending order (3.2.0.0, then 3.1.0.0, then 3.0.0.0, etc).

Curious, have you taken a look at the compat code in koko? It's not without its quirks, but I think this would be a good pattern for us to follow--pretty big task to implement though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your opinion. I have changed the hash table to an array.

Do you think it is OK?

@chronolaw chronolaw changed the title refactor(clustering/compat): using hash table to store compatible checker functions refactor(clustering/compat): using array to store compatible checker functions Jan 30, 2023
@chronolaw
Copy link
Contributor Author

Need update to #10192

@chronolaw chronolaw marked this pull request as draft January 30, 2023 09:41
@oowl
Copy link
Member

oowl commented Jan 31, 2023

For the compact problem, I have another piece of advice. we have all of the kong schemas, and we know all of the kong version schemas, Could this compatibility logic be solved in a similar way to how we use code generation in bindgen? we use more reliable code(iterating all kong schema) to generate the Version compatibility code.

@chronolaw chronolaw force-pushed the refactor/clustering_compat_check branch from cf83987 to 79e6f1e Compare February 1, 2023 01:51
@pull-request-size pull-request-size bot added size/L and removed size/M labels Feb 1, 2023
@chronolaw chronolaw marked this pull request as ready for review February 1, 2023 03:13
@chronolaw chronolaw force-pushed the refactor/clustering_compat_check branch 2 times, most recently from ea0a4d5 to 56e3cee Compare February 28, 2023 01:53
@chronolaw chronolaw force-pushed the refactor/clustering_compat_check branch from 56e3cee to a96a9a3 Compare March 7, 2023 03:03
@chronolaw
Copy link
Contributor Author

chronolaw commented Mar 8, 2023

Need update to #10400

@chronolaw chronolaw added the version-compatibility Incompatibility with older data plane versions label Mar 8, 2023
@chronolaw chronolaw force-pushed the refactor/clustering_compat_check branch from aee3033 to 082e6d8 Compare March 16, 2023 05:56
@vm-001 vm-001 self-requested a review March 16, 2023 07:47
@liverpool8056 liverpool8056 self-requested a review March 16, 2023 07:52
Copy link
Contributor

@vm-001 vm-001 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instance_name part looks OK to me. And also has a corresponding test to cover that.

@bungle bungle force-pushed the refactor/clustering_compat_check branch from 1582258 to 52a7a4a Compare March 16, 2023 10:44
@chronolaw
Copy link
Contributor Author

For the compact problem, I have another piece of advice. we have all of the kong schemas, and we know all of the kong version schemas, Could this compatibility logic be solved in a similar way to how we use code generation in bindgen? we use more reliable code(iterating all kong schema) to generate the Version compatibility code.

It is a good idea, perhaps we can do it in another PR in the future.

@chronolaw
Copy link
Contributor Author

chronolaw commented Mar 28, 2023

Echoing request-review request from Harry.

1/ From the maintainability perspective

  1. I don't see any sort function being called before iterating the COMPATIBILITY_CHECKERS, which means the program correctness is relying on the order of definition. Could we add some comments above the compatible_checkers var to describe that? Or could we introduce a sort mechanism?
  2. We may add more and more properties to each checker. I feel using the array index checker[n] is not friendly. What do you think of using map table to replace array table for each checker?
local compatible_checkers = {
  { 
    -- desc = "resolve plugin.instance_name",
    -- ... any other properties
    version = x, 
    fn = function() end,
   }
}

2/ From the style perspective

  1. checker is not an accuracy variable name here as it not only checks something but also rewrites something inside. resolver, or handler (or something else) may be a good candidate.

Overall, this is just my option and won't block the merge progress.

It is a very valuable suggestion, thank you @vm-001.

But Applying these changes may need more time and discussion, I think we could improve it in another PR later.

@liverpool8056
Copy link
Contributor

liverpool8056 commented Mar 28, 2023

Agree with @attenuation . And I think the current logic for compatibility is so complex, and we have to write the spec very carefully for ensuring the compatibility won't be broken when schemas are changed. It should have been a chance to simplify it, cause I think it's the DP's responsibility to ensure it can work even if it receives a unexpected config, most of the time it just receives some unknown fields or so. DP should have been capable to protect itself, at the same time, from the perspective of the users. DP should have been capable to ensure there is no unexpected behavior for the users as well.

@chronolaw chronolaw force-pushed the refactor/clustering_compat_check branch from 52a7a4a to 083b4a8 Compare April 3, 2023 01:40
@chronolaw
Copy link
Contributor Author

Updated to #10501.

@hbagdi hbagdi requested review from vm-001 and oowl April 5, 2023 19:54
@bungle
Copy link
Member

bungle commented Apr 6, 2023

I think it's the DP's responsibility to ensure it can work even if it receives a unexpected config, most of the time it just receives some unknown fields or so. DP should have been capable to protect itself, at the same time, from the perspective of the users. DP should have been capable to ensure there is no unexpected behavior for the users as well.

@liverpool8056,

I think it is CPs responsibility to send right config in a first place. The DPs should just put everything CP sends in LMDB. The DPs also already have ability to remove unknown fields on read. It would also be great if CP streams config in such a way that it could be easily be imported in LMDB, and not through all the unnecessary parsing, flattening, validation - so what you describe is what DPs already do, just that we need to remove the validation/flattening part when receiving config so that it does not error on it, of course this needs then to be backported to all supported lower versions. The current way we do this is backwards. We try to remove unknown fields on CP side while we still allow configuring them. You cannot ever solve this issue with it:

CP 1.0:
Imaginary plugin ACL-NEW has config.deny field.

DP 1.0:
Gets config from CP 1.0 and knows what to do with config.deny field.

CP upgraded to 1.1:
The ACL-NEW plugin gets new config.allow field.
Admin configures a new ACL-NEW plugin and sets config.allow

DP 1.0:
Gets config from CP 1.1, but CP removes the config.allow field from newly added ACL-NEW plugin, the ACL-NEW plugin does nothing on DP 1.0 and it is a serious security issue. (this cannot ever be solved with our current way of doing it). The DP cannot ever ensure it can work the right way in this situation. There is zero that DP can do, other than ignore or error. Our current strategy is to remove field on CP, which is about the same as DP ignoring errors. But a bit more visible to us, and more involving, and more error prone. Downgrading "never" works. Upgrading might, but we don't support newer DPs than CP, so you cannot upgrade DP beyond CP version.

DP 1.1:
Gets config from CP 1.1 and ACL-NEW plugin works.

On DBless it is worth to error on invalid config. just not on hybrid as there we should not have invalid configs in a first place, which means CP should use lowest (supported) common denominator when selecting schema version (which is set by the oldest supported connected DP, and when CP starts to support newer schema it should forbid older DPs connecting).

@bungle bungle merged commit d3d5839 into master Apr 6, 2023
@bungle bungle deleted the refactor/clustering_compat_check branch April 6, 2023 09:09
@hbagdi
Copy link
Member

hbagdi commented Apr 6, 2023

@chronolaw Please open the corresponding EE cherry pick PR.

@chronolaw
Copy link
Contributor Author

@chronolaw Please open the corresponding EE cherry pick PR.

Sure, but this PR need #10400 and #10501 to be cherry-picked first.

@hbagdi
Copy link
Member

hbagdi commented Apr 7, 2023

@chronolaw Please help out reviewing the cherrypicks of those PRs and then work on cherry-picking this one.

@outsinre
Copy link
Contributor

outsinre commented Apr 8, 2023

@chronolaw @hbagdi , as not yet manually cherry-picked. This will be cherry-picked by CE2EE merge PR: https://github.com/Kong/kong-ee/pull/4985.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core/clustering size/L version-compatibility Incompatibility with older data plane versions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants