-
Notifications
You must be signed in to change notification settings - Fork 512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changefeed implementation #1019
Conversation
a437971
to
f4f5f49
Compare
Also need to add image name filtering. Only going to implement global and single repo, not wildcarded |
@endophage This could be a great way to implement the historic fetching for #942 backwards-compatibility. Is that intentional or a happy coincidence? |
@ecordell this is going to provide versions over timestamps (because they represent actual published versions of the repo). I'd guess it would still be more efficient to query for root files by version, which are likely to change much less frequently. I'm thinking more and more it may also be worth implementing a hard requirement server side that updates increase versions of each role monotonically so we can guarantee "root.X" exists for all 0 < X < latest root version. |
OK, auth and image name filtering (single repo only, no wildcarding of partial names) implemented. Circle won't pass until I get the new migration files committed when I'm back at my desktop. |
489f2f4
to
77f5957
Compare
Signed-off-by: David Lawrence <[email protected]> (github: endophage)
|
||
// CreateChangefeedTable creates the DB table for Changefeed | ||
func CreateChangefeedTable(db gorm.DB) error { | ||
query := db.Set("gorm:table_options", "ENGINE=InnoDB DEFAULT CHARSET=utf8").CreateTable(&Changefeed{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want any indexes on gun and/or version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible we may want an index on gun as we'll filter by gun for single repo updates. No need for an index on version though as unless something is desperately wrong, it'll order the same way as the id and we shouldn't be querying the changefeed table for a specific version.
imageName = "*" | ||
} | ||
pageSizeStr := qs.Get("page_size") | ||
pageSize, err := strconv.ParseInt(pageSizeStr, 10, 32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use strconv.ParseUint
, which will error on negative input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
// if reversed, we need to flip the list order so oldest is first | ||
if reversed { | ||
for i, j := 0, len(changes)-1; i < j; i, j = i+1, j-1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: might be worth factoring this into utils
but leaving this here is fine for now.
r.Methods("GET").Path("/v2/changefeed/_trust").Handler(createHandler(_serverEndpoint{ | ||
OperationName: "Changefeed", | ||
ServerHandler: handlers.Changefeed, | ||
PermissionsRequired: []string{"*"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this just be pull
? Since pull permission are enough to get TUF metadata in the first place?
@@ -8,6 +8,10 @@ import ( | |||
"strings" | |||
"sync" | |||
"time" | |||
|
|||
"strconv" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: move this up to the golang import blocks
// change. The number of results returned is limited by pageSize. | ||
// Reversed indicates we are fetching pages going backwards in time, the | ||
// default being to fetch pageSize from changeID going forwards in time. | ||
GetChanges(changeID string, pageSize int, filterName string, reversed bool) ([]Change, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should remove reversed
from the signature since we do the reverse sorting in the server handler wrapper itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed down the ordering into the storage implementations. While it makes it more duplicated (exists in memory and sql stores) it also feels like the interface is more consistent.
// are equal, therefore, we want to return results starting at index | ||
// changeID+1 to match the exclusivity of the interface definition. | ||
func (st *MemStorage) GetChanges(changeID string, pageSize int, filterName string, reversed bool) ([]Change, error) { | ||
id, err := strconv.ParseInt(changeID, 10, 32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ParseUint
here too? Generally wondering if we should change pageSize
to a uint
in signatures too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pageSize
can be ParseUint
, but the changeID
can be -1
to indicate "start from the most recent change", in which case, I'm going to add a check to ensure if the changeID
is -1
the user has also requested the results to be reversed, otherwise it doesn't make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok that makes more sense, sounds good 👍
} | ||
} | ||
|
||
func serveError(log ctxu.Logger, w http.ResponseWriter, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for factoring this out 👍
requiredAccess = append(requiredAccess, auth.Access{ | ||
Resource: auth.Resource{ | ||
Type: "registry", | ||
Name: "catalog", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is catalog
in this context? All images? Does notary know how to handle this already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The auth lib and the token servers know how to handle this. According to @jlhawn this is the correct resource to request for listing all repos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After more discussion with @jlhawn, this resource should always be requested with the "*" permission, so I've hard coded that into this function. The configured "pull" action will now only be used when requesting access to the changefeed of a specific repo.
imageName = "*" | ||
} | ||
return imageName, nil | ||
case NoImageName: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'm wondering if this should be renamed to something else, like NoImageUsed
? I'm proposing this because of the if imageName == ""
in the case above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't indicate no image was passed, it indicates no image was expected. The imageName == ""
above is to set a default value when we expected an image name and none was explicitly given (and in the changefeed situation I put that in for, it's desirable to make that assumption when no name is given).
I opened issue #1021 to look at refactoring how we do auth in general because this is only one of the things I wasn't very happy with, regardless of how it's named.
…clude seeding changefeed table Signed-off-by: David Lawrence <[email protected]> (github: endophage)
fa99957
to
a89d6f6
Compare
Signed-off-by: David Lawrence <[email protected]> (github: endophage)
Signed-off-by: David Lawrence <[email protected]> (github: endophage)
Signed-off-by: David Lawrence <[email protected]> (github: endophage)
c1e7530
to
c3b68a0
Compare
) | ||
|
||
const ( | ||
changeCategoryUpdate = "update" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking: Can be done in the future when we actually have a consumer for this as a library, but should these be a type and exported, so someone using the API can check one against the other?
id string | ||
} | ||
|
||
func (err ErrBadChangeID) Error() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I had commented somewhere previously but did we want to use this for the case when a user requested a yet-to-exist change ID? We should just delete this otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the change ID is an index, I treated it as essentially "there's nothing in the range you requested", and returned an empty result rather than an error. Especially in the database cases, it's actually much easier to do that than run extra queries to determine if the ID is within the existing range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok got it, that's fine. Up to you whether you'd like to delete this code (since it's unused) or leave it for future use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, not sure how I ended up not using this... I know I did at some point. I need to make one more tiny tweak to tell the JSON encoding to return the change ID as a string so I'll take this out at the same time.
Data: []byte{}, | ||
}, | ||
})) | ||
require.NoError(t, s.UpdateMany("busybox", []MetaUpdate{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was wondering if we can insert a Delete
here or after the busybox updates? So that we can test that the delete events get returned, etc. Also, I think this is the same as the sql tests? (creating the same changes, testing the change feed, etc.) - would it make sense to move the implementation into tuf_store_test.go
(or we can rename that file or something) as testGetChanges
, and then invoke it with either a memory store or a sql store?
Similarly, can we add a Delete
of a GUN that doesn't exist to the end to ensure that no change object is added in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, maybe a s.UpdateCurrent
with a timestamp update, and s.UpdateCurrent
with a snapshot update? To ensure the timestamp update creates a change log, but the snapshot update does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops sorry, I didn't look far enough - not tuf_store_test
- storage_test
. Looks like we have a delete test in there - could add the check of the change feed in that test instead of in this one. There is also a rollback test in there - could we add an assertion that in the rollback case, no change log is created for that update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, done, and done! :-D
@HuKeping #796 from previous :) @endophage Thank you for all your work on this! It LGTM aside from a couple tests that would be awesome to add. |
c3b68a0
to
622d4ed
Compare
Current coverage is 75.15% (diff: 74.86%)@@ master #1019 diff @@
==========================================
Files 85 92 +7
Lines 11222 12542 +1320
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 8953 9426 +473
- Misses 1732 2493 +761
- Partials 537 623 +86
|
2337731
to
5387fd5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one small nit but otherwise LGTM! Awesome work on this :)
id string | ||
} | ||
|
||
func (err ErrBadChangeID) Error() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok got it, that's fine. Up to you whether you'd like to delete this code (since it's unused) or leave it for future use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
5387fd5
to
7e697de
Compare
I'd like to take a final review to it if it was not in a hurry to merge |
} | ||
|
||
func changefeed(logger ctxu.Logger, store storage.MetaStore, imageName, changeID string, records int64) ([]byte, error) { | ||
changes, err := store.GetChanges(changeID, int(records), imageName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried about the int64
-> int
, what about the records is over 2^32 - 1 and unfortunately we are on a 32 machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I parse it from the query string I specifically parse as a 32 bit. It's not great but strconv.ParseInt
always returned an int64
:-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, I see the strconv.ParseInt
, why not cast it in the function checkChangefeedInputs
where we do the strconv.ParseInt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could, I don't have any particular reason why it ended up being done here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The build is has been so flaky the last 48 hours I'm mildly afraid of making any more changes. I've filed issues to look at CI. Unless you feel really strongly about this I'm not going to change it with the tight timeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing more at present , I think we can do it in the followed PRs if we would like to change anything. Just go and get some sleep lol
pageSize, err = strconv.ParseInt(r, 10, 32) | ||
if err != nil { | ||
logger.Errorf("400 GET invalid pageSize: %s", r) | ||
err = errors.ErrInvalidParams.WithDetail("invalid records parameter, must be an integer >= 0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the err
will explain itself well, what we do now will lose the original error message which I think is worth to keep, for example it will return "value out of range" if the provided value is out of range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to include the original error string. We need to add some detail otherwise it's not clear to the user which parameter was wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah you are right 👍
// Reversed indicates we are fetching pages going backwards in time, the | ||
// default being to fetch pageSize from changeID going forwards in time. | ||
// The returned []Change should always be ordered oldest to newest. | ||
GetChanges(changeID string, records int, filterName string) ([]Change, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comments block needs some update, the pageSize
is now records
, the reversed
has been removed, etc.
7e697de
to
842dcf1
Compare
Signed-off-by: David Lawrence <[email protected]> (github: endophage)
842dcf1
to
d1440f6
Compare
LGTM, thanks for all your work on this @endophage ! |
Only works for the supported relational databases. No intent to implement for RethinkDB in this PR.
Closes #436
Implementation (
some of this needs to be done/tweaked, only basic global feed implemented at time of writing and some query string params need to be brought inline with the below):A changefeed can be requested for a specific repo, or globally. Query string params are
change_id
andrecords
, providing a starting index (exclusive) from which to query, and the number of results to return. Thechange_id
may be-1
indicating to start from the most recent change, and in this case, we will assume any value passed torecords
should be considered negative.records
may be a negative number indicating that number of records ordered before the index should be returned. URL for the feed is/v2/changefeed/_trust
, the final_trust
segment being included to keep compatibility with existing url routing for running notary beside registry.Signed-off-by: David Lawrence [email protected] (github: endophage)