Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changefeed implementation #1019

Merged
merged 13 commits into from
Nov 14, 2016
Merged

Conversation

endophage
Copy link
Contributor

@endophage endophage commented Oct 28, 2016

Only works for the supported relational databases. No intent to implement for RethinkDB in this PR.

Closes #436

Implementation (some of this needs to be done/tweaked, only basic global feed implemented at time of writing and some query string params need to be brought inline with the below):

A changefeed can be requested for a specific repo, or globally. Query string params are change_id and records, providing a starting index (exclusive) from which to query, and the number of results to return. The change_id may be -1 indicating to start from the most recent change, and in this case, we will assume any value passed to records should be considered negative. records may be a negative number indicating that number of records ordered before the index should be returned. URL for the feed is /v2/changefeed/_trust, the final _trust segment being included to keep compatibility with existing url routing for running notary beside registry.

Signed-off-by: David Lawrence [email protected] (github: endophage)

@endophage endophage force-pushed the changefeed branch 2 times, most recently from a437971 to f4f5f49 Compare October 31, 2016 18:16
@endophage
Copy link
Contributor Author

Also need to add image name filtering. Only going to implement global and single repo, not wildcarded docker.io/foo/*. Tests should pass this run.

@ecordell
Copy link
Contributor

@endophage This could be a great way to implement the historic fetching for #942 backwards-compatibility. Is that intentional or a happy coincidence?

@endophage
Copy link
Contributor Author

endophage commented Oct 31, 2016

@ecordell this is going to provide versions over timestamps (because they represent actual published versions of the repo). I'd guess it would still be more efficient to query for root files by version, which are likely to change much less frequently.

I'm thinking more and more it may also be worth implementing a hard requirement server side that updates increase versions of each role monotonically so we can guarantee "root.X" exists for all 0 < X < latest root version.

@endophage
Copy link
Contributor Author

endophage commented Oct 31, 2016

OK, auth and image name filtering (single repo only, no wildcarding of partial names) implemented. Circle won't pass until I get the new migration files committed when I'm back at my desktop.

@endophage endophage force-pushed the changefeed branch 5 times, most recently from 489f2f4 to 77f5957 Compare October 31, 2016 22:23
Signed-off-by: David Lawrence <[email protected]> (github: endophage)

// CreateChangefeedTable creates the DB table for Changefeed
func CreateChangefeedTable(db gorm.DB) error {
query := db.Set("gorm:table_options", "ENGINE=InnoDB DEFAULT CHARSET=utf8").CreateTable(&Changefeed{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want any indexes on gun and/or version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible we may want an index on gun as we'll filter by gun for single repo updates. No need for an index on version though as unless something is desperately wrong, it'll order the same way as the id and we shouldn't be querying the changefeed table for a specific version.

imageName = "*"
}
pageSizeStr := qs.Get("page_size")
pageSize, err := strconv.ParseInt(pageSizeStr, 10, 32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use strconv.ParseUint, which will error on negative input

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


// if reversed, we need to flip the list order so oldest is first
if reversed {
for i, j := 0, len(changes)-1; i < j; i, j = i+1, j-1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might be worth factoring this into utils but leaving this here is fine for now.

r.Methods("GET").Path("/v2/changefeed/_trust").Handler(createHandler(_serverEndpoint{
OperationName: "Changefeed",
ServerHandler: handlers.Changefeed,
PermissionsRequired: []string{"*"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this just be pull? Since pull permission are enough to get TUF metadata in the first place?

@@ -8,6 +8,10 @@ import (
"strings"
"sync"
"time"

"strconv"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: move this up to the golang import blocks

// change. The number of results returned is limited by pageSize.
// Reversed indicates we are fetching pages going backwards in time, the
// default being to fetch pageSize from changeID going forwards in time.
GetChanges(changeID string, pageSize int, filterName string, reversed bool) ([]Change, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove reversed from the signature since we do the reverse sorting in the server handler wrapper itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed down the ordering into the storage implementations. While it makes it more duplicated (exists in memory and sql stores) it also feels like the interface is more consistent.

// are equal, therefore, we want to return results starting at index
// changeID+1 to match the exclusivity of the interface definition.
func (st *MemStorage) GetChanges(changeID string, pageSize int, filterName string, reversed bool) ([]Change, error) {
id, err := strconv.ParseInt(changeID, 10, 32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParseUint here too? Generally wondering if we should change pageSize to a uint in signatures too

Copy link
Contributor Author

@endophage endophage Nov 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pageSize can be ParseUint, but the changeID can be -1 to indicate "start from the most recent change", in which case, I'm going to add a check to ensure if the changeID is -1 the user has also requested the results to be reversed, otherwise it doesn't make sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that makes more sense, sounds good 👍

}
}

func serveError(log ctxu.Logger, w http.ResponseWriter, err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for factoring this out 👍

requiredAccess = append(requiredAccess, auth.Access{
Resource: auth.Resource{
Type: "registry",
Name: "catalog",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is catalog in this context? All images? Does notary know how to handle this already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The auth lib and the token servers know how to handle this. According to @jlhawn this is the correct resource to request for listing all repos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After more discussion with @jlhawn, this resource should always be requested with the "*" permission, so I've hard coded that into this function. The configured "pull" action will now only be used when requesting access to the changefeed of a specific repo.

imageName = "*"
}
return imageName, nil
case NoImageName:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm wondering if this should be renamed to something else, like NoImageUsed? I'm proposing this because of the if imageName == "" in the case above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't indicate no image was passed, it indicates no image was expected. The imageName == "" above is to set a default value when we expected an image name and none was explicitly given (and in the changefeed situation I put that in for, it's desirable to make that assumption when no name is given).

I opened issue #1021 to look at refactoring how we do auth in general because this is only one of the things I wasn't very happy with, regardless of how it's named.

…clude seeding changefeed table

Signed-off-by: David Lawrence <[email protected]> (github: endophage)
@endophage endophage force-pushed the changefeed branch 4 times, most recently from fa99957 to a89d6f6 Compare November 3, 2016 18:07
Signed-off-by: David Lawrence <[email protected]> (github: endophage)
David Lawrence added 2 commits November 4, 2016 11:24
Signed-off-by: David Lawrence <[email protected]> (github: endophage)
Signed-off-by: David Lawrence <[email protected]> (github: endophage)
@endophage endophage force-pushed the changefeed branch 4 times, most recently from c1e7530 to c3b68a0 Compare November 11, 2016 21:27
)

const (
changeCategoryUpdate = "update"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking: Can be done in the future when we actually have a consumer for this as a library, but should these be a type and exported, so someone using the API can check one against the other?

id string
}

func (err ErrBadChangeID) Error() string {
Copy link
Contributor

@riyazdf riyazdf Nov 12, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I had commented somewhere previously but did we want to use this for the case when a user requested a yet-to-exist change ID? We should just delete this otherwise

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the change ID is an index, I treated it as essentially "there's nothing in the range you requested", and returned an empty result rather than an error. Especially in the database cases, it's actually much easier to do that than run extra queries to determine if the ID is within the existing range.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok got it, that's fine. Up to you whether you'd like to delete this code (since it's unused) or leave it for future use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, not sure how I ended up not using this... I know I did at some point. I need to make one more tiny tweak to tell the JSON encoding to return the change ID as a string so I'll take this out at the same time.

Data: []byte{},
},
}))
require.NoError(t, s.UpdateMany("busybox", []MetaUpdate{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was wondering if we can insert a Delete here or after the busybox updates? So that we can test that the delete events get returned, etc. Also, I think this is the same as the sql tests? (creating the same changes, testing the change feed, etc.) - would it make sense to move the implementation into tuf_store_test.go (or we can rename that file or something) as testGetChanges, and then invoke it with either a memory store or a sql store?

Similarly, can we add a Delete of a GUN that doesn't exist to the end to ensure that no change object is added in that case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, maybe a s.UpdateCurrent with a timestamp update, and s.UpdateCurrent with a snapshot update? To ensure the timestamp update creates a change log, but the snapshot update does not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops sorry, I didn't look far enough - not tuf_store_test - storage_test. Looks like we have a delete test in there - could add the check of the change feed in that test instead of in this one. There is also a rollback test in there - could we add an assertion that in the rollback case, no change log is created for that update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, done, and done! :-D

@cyli
Copy link
Contributor

cyli commented Nov 12, 2016

@HuKeping hmmm, I've just realized we don't actually have server API docs... Docker is also in the process of changing how it does docs and we need to setup a (Github Pages) microsite for Notary (which will contain notary specific docs with no relationship to the docker integration

Maybe file an issue to track this?

@HuKeping #796 from previous :)

@endophage Thank you for all your work on this! It LGTM aside from a couple tests that would be awesome to add.

@codecov-io
Copy link

codecov-io commented Nov 13, 2016

Current coverage is 75.15% (diff: 74.86%)

Merging #1019 into master will decrease coverage by 4.62%

@@             master      #1019   diff @@
==========================================
  Files            85         92      +7   
  Lines         11222      12542   +1320   
  Methods           0          0           
  Messages          0          0           
  Branches          0          0           
==========================================
+ Hits           8953       9426    +473   
- Misses         1732       2493    +761   
- Partials        537        623     +86   

Powered by Codecov. Last update 3ca7b72...a36df0e

@endophage endophage force-pushed the changefeed branch 4 times, most recently from 2337731 to 5387fd5 Compare November 13, 2016 07:44
Copy link
Contributor

@riyazdf riyazdf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one small nit but otherwise LGTM! Awesome work on this :)

id string
}

func (err ErrBadChangeID) Error() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok got it, that's fine. Up to you whether you'd like to delete this code (since it's unused) or leave it for future use.

Copy link
Contributor

@cyli cyli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@HuKeping
Copy link
Contributor

I'd like to take a final review to it if it was not in a hurry to merge

}

func changefeed(logger ctxu.Logger, store storage.MetaStore, imageName, changeID string, records int64) ([]byte, error) {
changes, err := store.GetChanges(changeID, int(records), imageName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about the int64 -> int, what about the records is over 2^32 - 1 and unfortunately we are on a 32 machine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I parse it from the query string I specifically parse as a 32 bit. It's not great but strconv.ParseInt always returned an int64 :-(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, I see the strconv.ParseInt , why not cast it in the function checkChangefeedInputs where we do the strconv.ParseInt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could, I don't have any particular reason why it ended up being done here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build is has been so flaky the last 48 hours I'm mildly afraid of making any more changes. I've filed issues to look at CI. Unless you feel really strongly about this I'm not going to change it with the tight timeline.

Copy link
Contributor

@HuKeping HuKeping Nov 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing more at present , I think we can do it in the followed PRs if we would like to change anything. Just go and get some sleep lol

pageSize, err = strconv.ParseInt(r, 10, 32)
if err != nil {
logger.Errorf("400 GET invalid pageSize: %s", r)
err = errors.ErrInvalidParams.WithDetail("invalid records parameter, must be an integer >= 0")
Copy link
Contributor

@HuKeping HuKeping Nov 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the err will explain itself well, what we do now will lose the original error message which I think is worth to keep, for example it will return "value out of range" if the provided value is out of range.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to include the original error string. We need to add some detail otherwise it's not clear to the user which parameter was wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you are right 👍

// Reversed indicates we are fetching pages going backwards in time, the
// default being to fetch pageSize from changeID going forwards in time.
// The returned []Change should always be ordered oldest to newest.
GetChanges(changeID string, records int, filterName string) ([]Change, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comments block needs some update, the pageSize is now records, the reversed has been removed, etc.

Signed-off-by: David Lawrence <[email protected]> (github: endophage)
@HuKeping
Copy link
Contributor

LGTM, thanks for all your work on this @endophage !

@endophage endophage merged commit c0cada4 into notaryproject:master Nov 14, 2016
@endophage endophage deleted the changefeed branch November 14, 2016 16:51
@endophage
Copy link
Contributor Author

@HuKeping @riyazdf @cyli Thanks for all the great review guys!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants