Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Persistence Layer on top of PubSub #33

Merged
merged 28 commits into from
Aug 19, 2019

Conversation

aschmahmann
Copy link
Contributor

@aschmahmann aschmahmann commented Jul 3, 2019

This PR is essentially the other half of libp2p/go-libp2p-pubsub#190 which is meant to replace libp2p/go-libp2p-pubsub#171 as a means of having PubSub as an independent transport for a Best-Record-Wins Key-Value Store (ipfs/kubo#6447).

This also helps with libp2p/go-libp2p-pubsub#42, even though it is only a layer on-top of pubsub.


Brief summary from libp2p/go-libp2p-pubsub#190:

To have a persistence layer we need to:

  1. Control what messages are broadcast (i.e. rebroadcast messages that might've been lost in the network, as in gossipsub, and prevent mass rebroadcasting of identical messages)
  2. Give developers an "OnJoin" method that can retrieve the "Best Available Record" when first connecting to a peer

Instead of doing this by making routers extensible, we will be leaving router extensibility for a later date and solve the issues above by:

  1. Adding a mechanism for PubSub to expose when it connects to a new peer (Add the ability to handle newly subscribed peers go-libp2p-pubsub#190)
  2. (Ab)using stateful validators + application layer publishing to control which messages are broadcast
  3. Registering a protocol for handling the "OnJoin" method for a given topic that gives peers the "Best Available Record"

This PR deals with 2 and 3.


Note this PR is blocked by libp2p/go-libp2p-pubsub#190 and to a lesser extent libp2p/go-libp2p-pubsub#184

@Stebalien @raulk @vyzo

pubsub.go Outdated Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
pubsub.go Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
log.Debugf("PubsubResolve: subscribed to %s", key)

if !bootstraped {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is blocked by libp2p/go-libp2p-pubsub#184

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put bootstrapping logic back in to sidestep this for now

pubsub.go Outdated Show resolved Hide resolved
@aschmahmann aschmahmann requested review from Stebalien, raulk and vyzo July 22, 2019 13:45
@aschmahmann aschmahmann self-assigned this Jul 22, 2019
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial pass

pb/message.proto Outdated Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
pubsub.go Outdated
return p.ps.Publish(topic, value)
done := make(chan error, 1)
go func() {
done <- p.ps.Publish(topic, value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Publish is really going to need to take a context: libp2p/go-libp2p-pubsub#184 (comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, removing all the bootstrapping code makes #28 really obvious. It also makes it easy to see that Publish 's lack of a context while potentially workable is quite problematic if we expect Publish to be blocking after libp2p/go-libp2p-pubsub#184 lands.

pubsub.go Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
pubsub.go Outdated Show resolved Hide resolved
… response can have nil data, use standard gogo protobuf tools).

Changed the protocol string for the get-latest protocol.
Separated out get-latest protocol and gave it explicit tests.
Added more tests and modified an existing one to take into account the use of the get-latest protocol.
Update go.mod to use newer (still forked) version of pubsub that has better peer event handling.
Changed rebroadcast to use one goroutine instead of one per topic (also upped the rebroadcast interval).
* Fixed existing issue with hanging on failed pubsub subscription.
Better handling of contexts.
Misc. code cleanup.
pubsub.go Show resolved Hide resolved
@aschmahmann
Copy link
Contributor Author

func (p *PubsubValueStore) GetValue(ctx context.Context, key string, opts ...routing.Option) ([]byte, error) {
if err := p.Subscribe(key); err != nil {
return nil, err
}
return p.getLocal(key)
}

Can't reference this since I haven't changed it in this PR, but there's two potential oddities here.

  1. getLocal calls a datastore which is frequently an IO operation, but takes no context.
  2. Should GetValue be blocking? It's not even that unreasonable to do this since the get-latest protocol means that if we wait for a little bit after we gain peers via bootstrapping we should get the latest value the somebody connected to us has.

Fine punting these (especially 2) until later if we want since it's potentially more complicated and I'm not sure if there are many consumers of GetValue anyway. Feel free to signal punt with an 👀 emoji.

getlatest.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
Wrapped protocol read/writes in contexts.
A little refactoring.
Protobuf Makefile builds on Windows too.
@aschmahmann
Copy link
Contributor Author

I don't believe the \ has any effect in this scenario (on Windows)

Without it I was running into issues where all my \s in my GOPATH were getting collapsed from C:\Users\a\b\c to C:Usersabc. However, adding quotation marks fixes it too. Pushed 😃

pb/Makefile Outdated Show resolved Hide resolved
@djdv
Copy link

djdv commented Aug 9, 2019

Makefile portion:
Works on my machine seal of approval. 👍

See: #33 (comment)

image

getlatest.go Outdated
host host.Host
}

func newGetLatestProtocol(ctx context.Context, host host.Host, getLocal func(key string) ([]byte, error)) *getLatestProtocol {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably make getLocal part of the type, given its usage pattern.
Also, better to typedef the function type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typedef 👍 (or an interface would also be fine)
part of the type 👎

If you'd prefer passing in an interface we can do that, but there's no reason to tie this to any particular implementation of "get stuff". We use a data store for this in the rest of this library, but data store has put + get, and we only need get.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better to be part of the type i think...
no need for an interface, just a function typedef will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, although we'll need to revisit this if/when we expose this function publicly since an interface will likely be more appropriate.

pubsub.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
getlatest.go Outdated Show resolved Hide resolved
pb/message.proto Outdated Show resolved Hide resolved
pubsub.go Show resolved Hide resolved
@djdv
Copy link

djdv commented Aug 12, 2019

Makefile portion:
Works on my machine seal of approval. 👍

I'm rescinding this for now and will post something after testing.

There needs to be some shenanigans around checking if we're using the win32 protoc or the mingw one, which influences what style of separator and path style protoc expects, regardless of what OS we're on. Reference

Edit:
Let's not block on this though. For now, using multiple proto_path args is the best temporary solution and fixes the builds at least when using Windows native protoc. The point of which is to avoid the need for a separator altogether.

A patch can come in a later PR that fixes support for other versions of protoc that are likely to be installed in a Windows environment.

@aschmahmann aschmahmann marked this pull request as ready for review August 13, 2019 21:41
pubsub.go Outdated Show resolved Hide resolved
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the protocol name and the replace directive. There's probably some cleanup we could do after all this refactoring but let's do that in a followup patch. This PR has >100 comments and >1000 lines.

pb "github.com/libp2p/go-libp2p-pubsub-router/pb"
)

const FetchProtoID = protocol.ID("/libp2p/fetch/0.0.1")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/libp2p/record-fetch/1 (or something). "fetch" is too generic (does it fetch blocks? keys? everything?). Users are going to try to use it to fetch IPLD blocks and be very confused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I figured that it would just fetch anything (i.e. []byte), and if for some reason someone wanted to configure their nodes to accept /ipfs/bafyABC via fetch and return a block that's something that would be up to them to do. This would also make it easier for people to reuse this protocol within or on top of other protocols.

Is that too generic/confusing for people? Should we just restrict it to records (i.e. things with validators)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually really like this. +1

fetch.go Outdated
}
}

func (p fetchProtocol) Get(ctx context.Context, pid peer.ID, key string) ([]byte, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: inconsistent receiver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think this is fixed/better now. Just called it Fetch.

pubsub.go Show resolved Hide resolved
pubsub.go Outdated
return nil
}

func (p *PubsubValueStore) rebroadcast(ctx context.Context) {
time.Sleep(p.rebroadcastInitialDelay)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit for later: this should use a select and a time.After.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, although if we really wanted we might be able to up efficiency slightly by reusing a Timer instead of a ticker.

go.mod Outdated
@@ -13,3 +14,5 @@ require (
github.com/libp2p/go-libp2p-routing-helpers v0.1.0
github.com/libp2p/go-libp2p-swarm v0.1.0
)

replace github.com/libp2p/go-libp2p-pubsub v0.1.0 => github.com/aschmahmann/go-libp2p-pubsub v0.0.4-0.20190807152749-d7996289bbcd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now uses a version of libp2p/pubsub from master. However, there hasn't been a release yet so it's referencing v0.1.1-0.20190807100218-9f04364996b4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants