Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply differential testing based on a reference implementation #60

Closed
danwt opened this issue Apr 22, 2022 · 10 comments
Closed

Apply differential testing based on a reference implementation #60

danwt opened this issue Apr 22, 2022 · 10 comments
Assignees
Labels
scope: testing Code review, testing, making sure the code is following the specification.

Comments

@danwt
Copy link
Contributor

danwt commented Apr 22, 2022

Idea

Use a reference implementation to generate correct states from actions

It might be feasible to take a property based testing approach. We could implement the relevant system in very simple, single threaded, in-memory, imperative code without any regards to performance (this is the 'model') and execute actions against it in order to get correct states for each action. The actions could be generated dumb-randomly or smart-randomly (like quickcheck ect) or something inbetween. Then we combine the actions and states, forming traces, and run the traces MBT style.

Note: there might be some existing tools we could use here, maybe even leveraging an existing PBT framework.

pbt

Wider context

I think this could be a good thing to do first as by implementing a reference implementation(s) I would learn a lot and this would help me write the TLA+ models later, which will be used for traditional verification (@mpoke) and also regular MBT. This work would also complement efforts to create a better test driver (see e.g. #58 (comment)) as a driver will be needed to execute traces. That interface of the driver could hook into e.g. a fork of the ibc-go testing framework.

(I discussed this idea with @jtremback earlier today).

wider context

Questions

The key question is whether it is cost effective to write a reference implementation. It must be protocol correct, of course! My feeling is that it will be very doable.

@danwt danwt self-assigned this Apr 22, 2022
@jtremback
Copy link
Contributor

I think that in reference to both this issue and #61, the important part will be to implement a driver that takes json traces and applies them to either a real network (like our integration tests) or the IBC-Go simulated chains.

This will be the bulk of the work, and from here it will be very easy to test in a variety of different ways, from hand written traces, to trivially randomized traces that just try to break invariants ("fuzzing"?), to property based or model based testing.

I would prioritize writing the driver, unless you think that the driver would be much easier with the knowledge gained from writing a reference implementation.

@jtremback
Copy link
Contributor

jtremback commented Apr 22, 2022

Actually, since we already have a driver for the integration tests, maybe an attempt to drive that with a reference implementation would be a good way to highlight any issues in the driver, and also help us learn what the limitations of tests on a real network are.

@mpoke
Copy link
Contributor

mpoke commented Apr 25, 2022

I like the idea of a reference implementation, but I don't know how difficult would be to make one that contains enough details to be relevant. For example, we could implement the entire IBC communication as a queue, but a straightforward implementation would not cover relaying delays.

@danwt
Copy link
Contributor Author

danwt commented Apr 25, 2022

Cool.

but a straightforward implementation would not cover relaying delays.

There could be an action 'deliver' to move a message from the queue to the chain so not calling that action might simulate a delay.

how difficult would be to make one that contains enough details to be relevant

Yeah that's an open question but one I'll have to solve to write a model checkable model anyway

@danwt
Copy link
Contributor Author

danwt commented Apr 26, 2022

@sainoe this is the best issue for describing the methodology we discussed in our meeting

@mpoke
Copy link
Contributor

mpoke commented Apr 26, 2022

The action space for CCV (i.e., external API):

  • on provider
    • delegate(D, V, A), where D is a delegator, V is a validator and A is an amount
    • undelegate(D, V, A)
    • redelegate(D, Vsrc, Vdst, A)
    • slash(V, infractionHeight, slashFactor)
    • jumpToBlock(h), where h is a height larger than the current height
  • on consumer
    • slash(V, infractionHeight, slashFactor)
    • jumpToBlock(h)

As discussed with @danwt, for now we assume that the channel initialization is established.

@mpoke mpoke added the scope: testing Code review, testing, making sure the code is following the specification. label Apr 26, 2022
@danwt
Copy link
Contributor Author

danwt commented May 4, 2022

The first application of this methodology is captured by two issues

  1. Write differential testing driver using ibc-go framework #87
  2. Write reference implementation to generate traces for differential testing #88

I will do (1) first. In my experience it is better to start with a minimal driver.

@danwt
Copy link
Contributor Author

danwt commented May 9, 2022

I like the idea of a reference implementation, but I don't know how difficult would be to make one that contains enough details to be relevant. For example, we could implement the entire IBC communication as a queue, but a straightforward implementation would not cover relaying delays.

From looking at what ibc-go/testing offers, and what hermes offer, in terms of testing it should be easy to model the network as a queue with pushOne(x) and popAll(), but it might be harder to implement popOne. This is because sendPacket on the sender will be like pushOne and UpdateClient on the recipient chain will do popAll() (same for hermes tx raw packet-recv).

@konnov
Copy link

konnov commented May 20, 2022

Hey @danwt, it just occurred to me that we might have missed the following option in the discussions. How about:

  1. Using TLC in the simulation mode to produce random traces,
  2. converting them to ITF and
  3. executing them with Atomkraft or your custom driver?

I don't see any drawbacks in comparison to PBT here. You would get non-determinism and expressiveness of TLA+ for free. Obviously, you would have the spec in TLA+, not a reference model in Python. By using the simulator, you can avoid the slowdown of TLC and Apalache, which we are experiencing with the full-scale model checking.

@cosmos cosmos deleted a comment from danwt May 25, 2022
@danwt danwt changed the title Apply 'property based testing' based on a reference implementation Apply differential testing based on a reference implementation Jun 6, 2022
@danwt
Copy link
Contributor Author

danwt commented Jun 8, 2022

Closed as PR

tackles this partially but now it is blocked by the TDD dev cycle. That is: the tests in the PR fail due to a the SUT not being fully implemented. I'm not sure if it makes sense to merge the PR currently, but further work is fine grained and deserves its own issue. Furthe more, this issue is not well written with closing criteria ect.

@danwt danwt closed this as completed Jun 8, 2022
Repository owner moved this from Todo to Done in Replicated Security Jun 8, 2022
mpoke added a commit that referenced this issue Jul 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope: testing Code review, testing, making sure the code is following the specification.
Projects
No open projects
Status: Done
Development

No branches or pull requests

4 participants