Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use block decoding framework #4040

Closed
wants to merge 2 commits into from
Closed

Conversation

Stebalien
Copy link
Member

Change IPFS to use the new pluggable Block to IPLD decoding framework.

Later, I'll pull the other node formats out of IPFS (at least the raw one).

Concern: As-is, it's non-obvious that importing go-ipld-cbor automagically makes
node.Decode decode CBOR nodes. Worse, it's non-obvious where we should be
importing this package (it really only needs to be imported once).

Change IPFS to use the new pluggable Block to IPLD decoding framework.

License: MIT
Signed-off-by: Steven Allen <[email protected]>
@Kubuxu Kubuxu added the status/in-progress In progress label Jul 5, 2017
@whyrusleeping
Copy link
Member

Concern: As-is, it's non-obvious that importing go-ipld-cbor automagically makes
node.Decode decode CBOR nodes. Worse, it's non-obvious where we should be
importing this package (it really only needs to be imported once).

Yeah, I'm actually in favor of having to call a sort of 'Register' function on the import to do the setup. That makes things more explicit, and potentially less error prone (I'm leaning towards this for the plugin stuff in #4033 )

@whyrusleeping
Copy link
Member

@Stebalien could you make codeclimate happy?

License: MIT
Signed-off-by: Steven Allen <[email protected]>
@Stebalien
Copy link
Member Author

could you make codeclimate happy?

Lovely... rust has the exact opposite convention: don't include the function's name in the documentation because it's obvious...

@whyrusleeping
Copy link
Member

Yeah, its done that way to force you to phrase your documentation consistently. Ends up looking and reading nicely in godoc: https://golang.org/pkg/net/http/#Serve

@Stebalien
Copy link
Member Author

Consistency is good, I'm just lamenting at the 14+1 conflicting standards.

Yeah, I'm actually in favor of having to call a sort of 'Register' function on the import to do the setup.

As pointed out in that PR, you still need a global. This means you either need to synchronize access to the global or be very careful to call register on startup only. The nice thing about the current implicit system is that all registrations are made in init functions (and it's very clear that registering late is unsafe).

Possible solutions:

  1. We could not provide a package-level Decode(block Block) function (or a DefaultBlockDecoder) in go-ipld-format. Instead, we'd could do all this setup in IPFS and have IPFS pass around a BlockDecoder. However, that's messy.
  2. We could protect the DefaultBlockDecoder variable with a sync.Once. If DefaultBlockDecoder isn't initialized on the first call to Decode, initialize it to an empty map and return errors from then on.
  3. We could make DefaultBlockDecoder initialize-once (using an atomic pointer).
  4. Make DefaultBlockDecoder RCU (read-copy-update): that is, if we want to update it, we'd load it using an atomic pointer, copy the map, update our copy, and then write it back with a compare and swap. In the common, sane case, there should be no contention.

@whyrusleeping
Copy link
Member

I think as long as we are clear that the user must do all registration in the global defaultblockdecoder i think its fine to keep it as is. All the other solutions to 'protect the user' feel pretty messy

@Kubuxu
Copy link
Member

Kubuxu commented Jul 10, 2017

There is also use case of plugins: #4033

Currently they are just hacking their way into the DefaultBlockDecoder, for now I think simple registration method and Getter protected by RWMutex would be ok.

With Go 1.9 we will get concurrent maps which will replace the RWMutex in this place.

@Kubuxu
Copy link
Member

Kubuxu commented Jul 10, 2017

I would also be for not making that global functions but a structure with functions. So that during testing or in future we can use it as non-global.

@Stebalien
Copy link
Member Author

for now I think simple registration method and Getter protected by RWMutex would be ok.

So, protect the user or not? You and @whyrusleeping appear to be in disagreement.

I would also be for not making it global function but a structure. So that during testing or in future we can use it as non-global.

Hm. The problem there is that you'd need to feed this structure to everything that needs to be able to decode blocks. The current way to test this is to edit the global DefaultBlockDecoder. While I agree that's suboptimal, this is the fundamental to global or not to global question.

My current use-case for this is floodsub; I need to be able to decode IPLD nodes. Requiring an IPLD block decoder in the constructor is a bit messy. It also means that different parts of the codebase can end up with different block decoders and blocks understood by some parts may not be understood by all parts.


If we do decide to "protect the user", I plan on changing the BlockDecoder type (currently defined as a map) to an interface as follows:

type BlockDecoder interface {
    Register(codec uint64, decoder DecodeBlockFunc)
    Decode(blocks.Block) (Node, error)
}
type safeBlockDecoder struct {
    lock sync.RWMutex // Can be replaced with atomic pointer magic as an optimization later.
    decoders map[uint64]DecodeBlockFunc
}

func (d *safeBlockDecoder) Register(codec uint64, decoder DecodeBlockFunc) {
    defer lock.Lock().Unlock()
    d.decoders[codec] = decoder
}

func (d *safeBlockDecoder) Decode(block blocks.Block) (nodes.Node, error) {
	// Short-circuit by cast if we already have a Node.
	if node, ok := block.(Node); ok {
		return node, nil
	}

	ty := block.Cid().Type()

        locker := d.lock.RLock()
	decoder, ok := b[ty];
        locker.Unlock()

        if ok {
		return decoder(block)
	} else {
		// TODO: get the *long* name for this format
		return nil, fmt.Errorf("unrecognized object type: %d", ty)
	}
}

var DefaultBlockDecoder BlockDecoder = &safeBlockDecoder{ decoders: make(map[uint64]DecodeBlockFunc }

// + global functions that call into DefaultBlockDecoder.

@Kubuxu
Copy link
Member

Kubuxu commented Jul 10, 2017

Plugins break the premise that everything is initialized in init() so I think we need to protect it. The concurrent maps will provide us with low overhead primitive for that.

Hm. The problem there is that you'd need to feed this structure to everything that needs to be able to decode blocks. The current way to test this is to edit the global DefaultBlockDecoder. While I agree that's suboptimal, this is the fundamental to global or not to global question.

For now I would be OK with everything using it as global, there are just places where having an instance makes more sense.

@Stebalien
Copy link
Member Author

So I'll go with the code I posted above for now.

The concurrent maps will provide us with low overhead primitive for that.

For our use-case, I think an RCU using an atomic pointer would be faster than anything in the standard library given our use-case (unless go's concurrent maps do some really fancy static-analysis). The downside is that doing fancy things with atomic pointers can lead to fancy bugs.

@Kubuxu
Copy link
Member

Kubuxu commented Jul 10, 2017

It is always that trade off.

(I have version of bblom code that uses atomic ops and CAS, it is faster but current overhead is not high enough to risk possible bugs in it).

@whyrusleeping
Copy link
Member

I'm really leaning towards not protecting it, even with plugins things should register in serial and all up front.

If you feel strongly that we should protect it @Kubuxu then lets do it, just make it clean.

@Kubuxu
Copy link
Member

Kubuxu commented Jul 11, 2017

There is a risk in not protecting it (single miss behaving piece of code can cause application panic), so I would say, let's protect it. If the cost becomes to high, we can decide to change it to RCU (which should have very low cost).

@whyrusleeping
Copy link
Member

Lets just make sure to implement it in a way that doesnt block us from doing block decodes in parallel

@Stebalien
Copy link
Member Author

ipfs/go-ipld-format#22

@Stebalien
Copy link
Member Author

Superseded by #4060

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants