Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to defer field resolution (to solve N+1 problem) #66

Closed
vladar opened this issue Nov 4, 2016 · 16 comments
Closed

Ability to defer field resolution (to solve N+1 problem) #66

vladar opened this issue Nov 4, 2016 · 16 comments

Comments

@vladar
Copy link
Member

vladar commented Nov 4, 2016

There were several discussions about ways to solve N+1 problem in GraphQL - #42, #60.

There are two classes of solutions for this problem:

  1. Look-ahead soltion when we analyze the query and pre-fetch required data upfront (see Introduce query plan as a replacement for ResolveInfo::getFieldSelection #65)
  2. Defer field resolution (and thus enable batching queries using techniques like DataLoader)

Deferring syntax:

'resolve' => function($obj) {
    $this->dataLoader->buffer($obj->someId);

    return new GraphQL\Deferred(function() use ($obj) {
       // This will internally load all buffered ids (once) and return the one requested here:
        return $this->dataLoader->fetch($obj->someId);
    });
}

Other possible syntax:

'buffer' => function($obj) {
    $this->dataLoader->buffer($obj->someId);
},
'resolve' => function($obj) {
    return $this->dataLoader->fetch($obj->someId);
}

Technically it will require Executor algorith change from Depth-First to mix of Breath-First and Depth-First (requires researching and prototyping).

@mcg-web
Copy link
Collaborator

mcg-web commented Nov 4, 2016

I think that deferring syntax is the best. To keep flexibility a chaining system should also be implemented. I like the guzzle implementation of Promises/A+.

@mcg-web mcg-web closed this as completed Nov 4, 2016
@chrissm79
Copy link

@vladar @mcg-web Are promises possible in a non event loop environment? Or can a loop be created for each request? If we can do the latter, I would agree with @mcg-web that the Guzzle (or ReactPHP) implementation would be a great fit.

However, if this forces the rest of a project to run inside an event loop environment I think this would have to be something that is optional.

Regardless, I'm really excited to see the discussion of some sort of incorporated DataLoader getting started!

@vladar
Copy link
Member Author

vladar commented Nov 4, 2016

@chrissm79 No they are not possible directly, see #42 for full discussion on this subject. Yet we may adjust executor to postpone deferred values up until later point.

Here is a simple example:

{
  a {
    b
    deferred {
      c {
        d
        e
      }
    }
  }
  f
  deferred2 {
    h {
      i
    }
  }
}

Current executor runs fields serially: a, b, deferred, c, d, e, f, deferred2, h, i (depth-first)
New executor would evaluate them in following order: a, b, deferred, f, deferred2, c, d, e, h, i (mixed depth-first and breadth-first)

Actual code will be more complicated, but that should demonstrate the idea.

@mcg-web Why closing this?

@vladar vladar reopened this Nov 4, 2016
@mcg-web
Copy link
Collaborator

mcg-web commented Nov 4, 2016

@vladar sorry I push the wrong button. Dataloader should be implement in a external lib like Facebook js implementation. I open a PR #67 that could be a good start for this...

@vladar
Copy link
Member Author

vladar commented Nov 4, 2016

@mcg-web I don't quite understand how promises help us in case of sync code. My current understanding is that promises will only work for async environments. In sync environments they don't help too much: wait method simply blocks execution, but doesn't change the order in which fields are executed.

See example query above - how promises can help to achieve this order of field execution in sync environment? Sorry, maybe I am missing something.

@mcg-web
Copy link
Collaborator

mcg-web commented Nov 4, 2016

@vladar First you create a promise that add the object id to your dataloader, the wait method calls the dataloader to fetch the data, at that point dataloader have all needed ids so can easily regroup request... I'll write a small example for better explanation...

@mcg-web
Copy link
Collaborator

mcg-web commented Nov 4, 2016

here a fast example:

namespace GraphQL\Promise;

class DeferredPromise implements PromiseInterface
{
    private $callback;

    private $dataLoader;

    private $id;

    public function __construct($id, callable $callback)
    {
        $this->dataLoader = DataLoader::getInstance();
        $this->dataLoader->buffer($id);
        $this->id = $id;
        $this->callback = $callback;
    }

    /**
     * Waits until the promise completes if possible.
     *
     * @return mixed
     * @throws \LogicException if the promise has no wait function or if the
     *                         promise does not settle after waiting.
     */
    public function wait()
    {
        $object = $this->dataLoader->fetch($this->id);

        return call_user_func($this->callback, $object);
    }
}

@vladar
Copy link
Member Author

vladar commented Nov 5, 2016

It's not the deferred/promise interface or syntax that bothers me. I don't understand how Promises (any implementation) can help us to defer field execution in sync environment.

The only conceptual thing we need in sync environment is resolver returning some callable that will be called only after all other available fields are executed.

How this callable is implemented (wait method of some interface or wrapped Closure/Callback) - is a secondary question.

The main question is - how do we actually defer execution of this callable until completion of all other available fields?

That's what bothers me and I don't understand the value of Promises for this deferring process in sync environment. They just don't provide any help here (unless I am missing something).

My current understanding is that there are two (quite orthogonal) mechanisms:

  1. Promises in async environment do help to defer execution by leveraging underlying event-loop
  2. In sync environment you need to write custom mechanism for deferring fields (which won't get much help from promises)

So technically this issue is not about adding promises support, but about changing the algorithm of field execution in sync environments.

Does it make any sense?

@vladar
Copy link
Member Author

vladar commented Nov 5, 2016

After some thinking one idea I have on how to leverage Promises for sync environment - is by implementing our new executor algorithm as event loop (technically field-loop in our case).

Then in environments with native event loop (like ReactPHP) we can use promises directly, but in sync environments we will run custom field loop (which in turn will call wait method from your example).

I guess it is similar to what @ivome mentioned in this comment.

@shadowhand
Copy link

shadowhand commented Nov 15, 2016

Promises might allow for loading data in parallel, but won't help with deferring. A simple check against a Deferred type should be enough. Really, any implementation of JsonSerializable should be sufficient. Effectively:

public function jsonSerialize()
{
    return call_user_func($this->finalResolver, $this);
}

Where the finalResolver is the callback that executes the dataloader and returns the materialized result.

@vladar
Copy link
Member Author

vladar commented Nov 16, 2016

@shadowhand Simple implementation with JsonSerializable won't work. After calling resolve value is passed to nested object resolvers. In this case - your deferred value with no data, which will actually emit null for all nested fields.

@shadowhand
Copy link

@vladar oh I didn't mean to imply this would be all that would be required. There would definitely have to be detection for these deferred values before final object resolution. My point was more about the fact that promises don't help with deferred values.

@vladar
Copy link
Member Author

vladar commented Nov 16, 2016

@shadowhand Ah, agree %)

There is actually one way how Promises can potentially help. See discussion in #67

I have big doubts about it's convenience though, but need to play with it first before making conclusions.

@shadowhand
Copy link

After reviewing #67 and overblog/dataloader-php I am actually in favor of the promise approach. For me, async data loading that solves the N+1 problem is the significant blocker to adopting GraphQL for real world applications.

@vladar
Copy link
Member Author

vladar commented Dec 2, 2016

For anyone interested - we added ability to defer field execution in master. Still experimental phase, but all tests pass and it's time to play with it.

The syntax for standard sync mode is the one I described previously, namely:

'resolve' => function($obj) {
    $this->dataLoader->buffer($obj->someId);

    return new GraphQL\Deferred(function() use ($obj) {
       // This will internally load all buffered ids (once) and return the one requested here:
        return $this->dataLoader->fetch($obj->someId);
    });
}

It is fully backwards compatible and works in standard PHP versions (no need for ReactPHP or other async platforms to leverage this feature).

Technically it simply alters the order of field execution. Deferreds are skipped for resolution until all non-defered fields are resolved. It enables quite effective batching of queries.

Under the hood it uses Promises as main concept for deferring. Promises do help even in sync environment when combined with queues.

We could implement it without promises (by using field queue instead), but with promises we can leverage any async platform which supports them (like ReactPHP, HHVM, icicle.io, php threads, etc). Obviously the syntax will be different in this case comparing to sync mode (you will return promises of the platform of interest vs GraphQL\Deferred), but conceptually it will work the same way.

Thanks a lot to @mcg-web for contributing Promises support and tests!

@vladar
Copy link
Member Author

vladar commented Dec 19, 2016

Docs entry: http://webonyx.github.io/graphql-php/data-fetching/#solving-n1-problem

@vladar vladar closed this as completed Dec 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants