Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Samples and documentation #4

Closed
fiLLLip opened this issue Oct 20, 2015 · 31 comments
Closed

Samples and documentation #4

fiLLLip opened this issue Oct 20, 2015 · 31 comments

Comments

@fiLLLip
Copy link
Contributor

fiLLLip commented Oct 20, 2015

Would love to see some documentation, maybe some samples on how to use the different endpoints.
I'm trying to use it like the official documentation states, but I get a lot of errors, and the way I manage to use some of it feels more like a hack than the right way to do it.

To get insert to work I had to do the following where obj is a custom class with properties:

var json = JsonConvert.SerializeObject(obj);
var keyValue = JsonConvert.DeserializeObject<Dictionary<string, string>>(json);
_r.db(dbName).table(tableName).insert(keyValue).run(_connection);

Putting the object directly resultet in an exception, same with the serialized string.

I have not yet managed to get getAll to function. If someone could lead me in the right direction, I could possibly write some documentation along the way while using it.

@bchavez
Copy link
Owner

bchavez commented Oct 20, 2015

Hey @fiLLLip ,

You're right... the driver right now a bit rough around the edges at the moment.

Something like this would be a lot less hack-ish:

_r.db(dbName).table(tableName).insert(obj).run(_connection);

There is a PR that adds POJO in the Java driver (in our case POCO) support rethinkdb/rethinkdb#4889 for a less-hackish use case; but it hasn't been merged yet by Josh (the main author for the official Java driver).

I'm kinda waiting to see what Josh does with this PR. Right now, Josh's efforts have been focused on getting 1000+ ReQL unit tests converted into Java to test the validity of the driver's operations.

Give me a few hours and I'll see if I can make your use case a little less hackish based on PR rethinkdb/rethinkdb#4889.

@bchavez
Copy link
Owner

bchavez commented Oct 21, 2015

Hi @fiLLLip ,

I've added some support for inserting and retrieving POCOs without hackish dictionary MapObject style conventions.

Here's an examples:

[Test]
public void insert_test()
{
    var obj = new Foo { id = "abc", Bar = 1, Jam = 2};
    var result = r.db(DbName).table(TableName).insert(obj).run(conn);
    result.Dump();
}

[Test]
public void insert_an_array_of_objects()
{
    var arr = new[]
        {
            new Foo {id = "a", Jam = 1, Bar = 1},
            new Foo {id = "b", Jam = 2, Bar = 2},
            new Foo {id = "c", Jam = 3, Bar = 3}
        };
    var result = r.db(DbName).table(TableName).insert(arr).run(conn);
    result.Dump();
}

[Test]
public void get_test()
{
    var foo = r.db(DbName).table(TableName).get("a").run<Foo>(conn);
    foo.Dump();
}

Both, insert and get and get should work. The implementation is _roughly_ based on PR rethinkdb/rethinkdb#4889.

Brief Technical Details and Usage for POCOs

This driver will mount the POCO object into the AST as a Poco AST object. When the driver begins serializing the AST, the POCO object will be converted when the build() method is called. The default converter is a static method call on Poco.DefaultConverter.

All the standard Newtonsoft [Json*] attributes will work out of the box. If this isn't enough and you want more control over the POCO serialization process, you can override the default POCO converter by setting the Poco.Converter static property before the driver boots up:

Poco.Converter = o =>
      {
          return JObject.FromObject(o, new JsonSerializer() {Formatting = Formatting.Indented});
      };
//then use the driver as normal.

Every time a POCO object needs to be converted, your custom converter function will be called.

This is all still cutting edge, so the final implementation may change based on @deontologician 's review of rethinkdb/rethinkdb#4889 in the Java driver, but I'll document any breaking changes as they occur.

RethinkDb.Driver 0.0.5-alpha2 with POCO support is now available on NuGet for DNX CoreCLR. I'll leave the this issue open until we have the final implementation for POJO support in the Java driver.

I hope this helps and I apologize for the inconvenience; we're still developing the the drivers and getting the rough edges out. Rock on 🚀.

@fiLLLip
Copy link
Contributor Author

fiLLLip commented Oct 21, 2015

Looks promising! I will give it a try 👍 And don't apologize for doing amazing work 😄

@bchavez
Copy link
Owner

bchavez commented Oct 21, 2015

Thanks @fiLLLip let me know how it goes.

Also, I was looking at getAll last night, and I think, it will be slightly more complex to fix. RethinkDB is letting us know a sequence is being returned and the driver is trying to return a Cursor that handles sequences; when actually, the generic return type is Cursor<Cursor<Foo>>.

I'll need a bit more time to experiment with the Java driver to determine if this is a bug that affects both drivers.

@deontologician
Copy link

I am coming back to work today, and I'm pretty close to having all the
tests fixed. The pojo patch is on my list next and my biggest concern is
the cursor implementation.

On Wed, Oct 21, 2015, 07:54 Brian Chavez [email protected] wrote:

Thanks @fiLLLip https://github.com/fiLLLip let me know how it goes.

Also, I was looking at getAll last night, and I think, it will be
slightly more complex to fix. RethinkDB is letting us know a sequence is
being returned and the driver is trying to return a Cursor that handles
sequences; when actually, the generic return type is Cursor<Cursor>.

I'll need a bit more time to experiment with the Java driver to determine
if this is a bug that affects both drivers.


Reply to this email directly or view it on GitHub
#4 (comment)
.

@bchavez
Copy link
Owner

bchavez commented Oct 21, 2015

@deontologician Yeah, I think the Cursor<T> implements a unique challenge here at least here in the C# driver.

C#, we can run: bool t = r.expr(true).run<bool>(conn); and get a strongly typed bool atom back. This is cool and good.

A problem occurs when we try to use a Cursor<T> with something like getAll: C#,

var all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<Cursor<Foo>>(conn)

where the type parameter T in run call is Cursor<Foo>.

The driver reads the response from RethinkDB and detects a sequence is returned so res.IsSequence = true and the following problematic cast occurs:

internal virtual T RunQuery<T>(Query query)
{
 ....
    if( res.IsAtom )
    {
         ....
    }
    else if( res.IsPartial || res.IsSequence )
    {
        ICursor cursor = Cursor<T>.create(this, query, res);
        cursor.Extend(res);
        return (T)cursor;
    }

Problem is T is Cursor<Foo>, so we basically end up getting Cursor<Cursor<Foo>>. Compare this to Java's implementation:

@SuppressWarnings("unchecked")
<T> T runQuery(Query query) {
    .....
    if(res.isAtom()) {
      .....
    } else if(res.isPartial() || res.isSequence()) {
        Cursor cursor = Cursor.create(this, query, res);
        return (T) cursor;
    }

The Java call to Cursor.Create without having the item type information is a bit of a mystery to me. If this cast is really working in Java, my gut feeling is the Java compiler is doing some magic to infer the Cursor's _item_ type in Cursor<T> that we simply can't with C#.

Even if we omit the explicit Cursor in run<Cursor<Foo>>() to simply run<Foo>(), we must must return Foo in C#, not a derivation, collection, or Cursor of Foo.

It almost seems like C# will need a runCursor<T> if we want to carry the item's type information into Cursor<T>.

Another alternative is to delay the item's type information after the run<T> and remove the type parameter in Cursor<T> to simply Cursor, so run<Cursor>().OfType<T>().

Option A:

var all = r.db(DbName).table(TableName).getAll("a", "b", "c").runCursor<Foo>(conn)

Option B:

var all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<Cursor>(conn).OfType<Foo>()

Option C:

var all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<Cursor, Foo>(conn)

Not sure about all this. Need a bit more time to think about the issue and perform some experiments. 💭.

@fiLLLip
Copy link
Contributor Author

fiLLLip commented Oct 21, 2015

How about a fourth option D:

var all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<Foo>(conn)

Here getAll is of a new type called ReqlAstCursor (or something) and returns a Cursor<Foo> all.
More specifically, T run<T>() is changed to Cursor<T> run<T>().

I have not researched if this would break any other desired behavior, something I suspect it does, but it does not break the desire to be as close to the official lib as possible.

A fifth option E, would be to always return as a Cursor<T> run<T>(), and it would be up to the api user to only use the first item when fetching via get(), and iterating when using getAll().

Could any of this be useful?

@deontologician
Copy link

That would be very different from the api of the other drivers, generally it's nice to be able to get a response unwrapped if it doesn't need to be. Conceptually, all that needs to happen is the cursor has to store how to deserialize each element. In java that's using the Foo.class object, I'm not sure if C# has the same kind of thing

@bchavez
Copy link
Owner

bchavez commented Oct 21, 2015

Ah, very nice options D and E as well... Thanks @fiLLLip .

Some thoughts on Option D...
I explored some code on Option D and I think the run methods in ReqlAst.cs would raise a compiler issue since we can't have methods with the same signature (but different return types):

ReqlAst.cs

public virtual T run<T>(Connection conn)
{
    return (T)conn.run<T>(this, new OptArgs());
}
public virtual Cursor<T> run<T>(Connection conn) //Compile Error in ReqlAst
{
        return //cursor...
}

Also, we can't override run with a different return types if we derived terms from ReqlAstCursor. I think for Option D we'd also need to identify which terms can return cursors and which terms return atoms so we could inherit from ReqlAstCursor super class.... And we'd also need to be aware of which terms could return both a atom objects and cursors (and then we'd need to separate them anyway to keep the compiler happy)... 💭

Thanks for your suggestions. Keep them coming 👍 _Any and all_ ideas help explore this issue fully.

@bchavez
Copy link
Owner

bchavez commented Oct 21, 2015

Ah, @deontologician gave me another idea. If the Cursor conceptually just deserializes, perhaps we can publicly avoid exposing the Cursor all together.

Option F:

var all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<IEnumerable<Foo>>(conn)

But I'm not sure this is good either since there is some useful information in Cursor like .IsFeed. Also, we'd need a deeper integration with Newtonsoft to hook into to emulate a Cursor-IEnuemrable.

Another Option G:
Dig deeper into T from run<T> when run<Cursor<Foo>>, to pluck-out the _item_ Foo type. And perhaps, maybe using Activate to new up a real Cursor<Foo> instead of using Create<T>.create. Then the cast would be proper when returning T. My only hope is that Option G won't have too much of a performance impact reflecting deeper into T for the cursor's element type.

@bchavez
Copy link
Owner

bchavez commented Oct 21, 2015

Exploring Option G more in LinqPad yields a decent solution and allows us to still stay within the confines of the Java driver API regarding Cursor<T>s. We'd need to unseat DefaultCursor from a nested class in Cursor to keep this solution's complexity to a minimum but shouldn't be a big deal in the grand scheme of things.

void Main()
{
    var all = run<Cursor<Foo>>();
    all.Should().NotBeNull();
    all.GetType().Should().BeAssignableTo<Cursor<Foo>>();
}

// Run and pluck out the Cursor's T.
public static T run<T>()
{
    if ( /*res.IsPartial || res.IsSequence = */ true)
    {
        var type = typeof(T);
        if (type.GetGenericTypeDefinition() == typeof(Cursor<>) &&
            type.GenericTypeArguments.Length == 1)
        {
            var cursorItemType = type.GenericTypeArguments[0];
            return (T)Cursor.create2(cursorItemType);
                          // AND THE CAST IS VALID
        }
        else
        {
            throw new InvalidOperationException("The response is a sequence and 
                                                           should be ran with Cursor<T>");
        }
    }
    return default(T);
}

public class Cursor
{
    public static object create2(Type itemType)
    {
        var cursorWithItemType = typeof(DefaultCursor<>)
          .MakeGenericType(itemType);

        return Activator.CreateInstance(cursorWithItemType/*, conn, query, firstResponse*/);
    }
}

public class Cursor<T>
{
}
public class DefaultCursor<T> : Cursor<T>
{
}

public class Foo
{
    public int Bar { get; set; }
    public int Baz { get; set; }
}

If anyone doesn't have any objections, this might be the best way to go... thoughts? We'll just throw an exception if the RethinkDB response is a sequence and the user's return type something that was not a run<Cursor<..>>... The response is a sequence and should be ran with Cursor<T> 💭

@deontologician
Copy link

This won't work in Java since generic type arguments are not present at runtime. But I think you should do what's nice for C# rather than stick to restrictions on the Java driver

@bchavez
Copy link
Owner

bchavez commented Oct 21, 2015

@deontologician how would the Java driver work under this getAll query?

Cursor<Foo> all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<Cursor<Foo>>(conn)

Would the statement above be the correct usage with the Java driver (and when RethinkDB returns a sequence)?

Option G here in the C# driver would maintain the same behavior. The quickest way out for us would be to have a dedicated runWithCursor<T>(conn) method that was dedicated to returning cursors, but I think that would be absent on the Java driver.

@deontologician
Copy link

I'm thinking something like

Cursor<Foo> all = r.db(DbName).table(TableName).getAll("a").run(conn, Foo.class);

Then we pass the extra param through to runQuery, and it hands it off to Cursor when it instantiates one. I think DefaultCursor should have a sibling that has this coercion stuff in it

@deontologician
Copy link

So, the meaning of .run(conn, Foo.class) would depend on what's returned. If it's an atom, then the result type will be Foo, if it's a cursor, then the result type will be Cursor<Foo>

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

I see. One last question: If the response type is atom from getAll wouldn't the declaration of all need to change also?

Foo all = r.db(DbName).table(TableName).getAll("a").run(conn, Foo.class);

It seems like the user would need to know if it's an atom or sequence response ahead of time in order to determine the correct all type declaration.

Or would Cursor<Foo> all = ...run(conn,Foo.class); be compatible with an atom response? Almost seems like the initial declaration of Cursor<Foo> the same use as Foo.

@deontologician
Copy link

getAll always returns a cursor. Basically, yeah the shape of the returned results needs to be known by the user when writing the query. If you don't want a cursor you can modify the query like:

// get an list of Foo instead of a cursor
List<Foo> x = ...getAll("a").coerceTo("ARRAY").run(conn, Foo.class);
// get just the first element
Foo x = ...getAll("a").nth(0).run(conn, Foo.class);

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

AWSOME. Thanks! This info helps a lot!

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

😢 it looks like the reflection tax of Option G isn't worth it.

100,000,000 iterations

Option G run<Cursor<Foo>>: 146,458 ms
Option A runCursor<Foo>: 1,134 ms

I suppose if the Java driver expects the user to know the shape of the query results, I guess we could expect the user of the C# driver to call runCrusor<T> where appropriate.

@deontologician , is there any way to know which terms always return sequences and which terms always return atoms? like getAll always returning sequences? Or maybe an easy way to generate this kind of meta info?

@deontologician
Copy link

Unfortunately, we can't do a good enough job. RethinkDB is designed around a dynamic type system, so it's hard to really get the type at compile time without seriously limiting the kinds of queries you can do. A (contrived) example would be something like:

Object foo = r.table("foo").get("bar").coerceTo(r.table("baz").get("qux").getField("what_to_coerce_to"))

The result type of this query depends on what data is actually in your database. The programmer will know the constraints on the system and know that it's going to work out, but we can't necessarily encode those constraints into the static types available in Java and C#.

@fiLLLip
Copy link
Contributor Author

fiLLLip commented Oct 22, 2015

@bchavez To skip the need of reflection, but make the whole system a lot more complex, could we look into Option D? This means we have to map all functions that can return a Cursor<> or is a sequence into ReqlAstCursor which has the following run: Cursor<T> run<T>().
This also breaks the "one to one mapping" of the Java driver, and can add some complexity to maintenance...

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

Thanks @deontologician for your help and guidance. It's invaluable at this stage of development.

I've spiked on Java's Erasure of Generic Types, it's very interesting and seems like it's a strong influence on the crux of the problem here in C# land. Feels like the C# compiler is imposing too much type safety to make this work nicely under one method run.

Just now thought of a new option.... in following the spirit of RethinkDB's dynamic type system.... another option worth exploring would be to introduce the DLR into the C# driver and go fully dynamic.

Option H

dynamic all = r.db(DbName).table(TableName).getAll("a", "b", "c").run(conn)

With Option H, all bets are off... we'd be fully dynamic at this point and it would be up to the user to figure out how to use the return types. The driver may return whatever it finds from the response, Foo (from atom), Cursor<Foo> (from sequence).

Thoughts? Option H might upset the apple cart with C# devs, but it's truly following the rest of the RethinkDB driver ecosystem.

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

Hey @fiLLLip ,

In thinking about Option D we'd need:

  1. Maintain which AST terms can generate sequences and ones that generate atoms....
  2. Maintain two distinct term super classes ReqlAstAtom and ReqlAstCursor.
  3. Each super class having their own version of run() Atom) T run<T>() and Cursor) Cursor<T> run<T>().

My gut feeling is also telling me Option D might limit (and add complexity) to the expressiveness of ReQL in some way... especially when queries get very complex with lambda ReqlFunctions.

Also, my fear with Option D is the more differences we accumulate between C# and Java, the harder updating this driver will be.

@fiLLLip do you have any thoughts about going purely dynamic with Option H?

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

Interestingly ... with Option H it looks like we can get the best of both worlds.

When we define the run method to return dynamic we can write:

Anything

dynamic all = r.db(DbName).table(TableName).getAll("a", "b", "c").run(conn)

Expect atom Foo as we do normally

Foo all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<Foo>(conn)

And last but not least... _magical_: Expect Cursor<Foo> without run<Cursor<Foo>>()

Cursor<Foo> all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<Foo>(conn)

💥 Compiles and WORKS! And, all will be typed as Cursor<Foo>!

In fact, _performance wise_, we're all in the same order of magnitude:

100,000,000 iterations

dynamic all = run<Foo>() // 3656 ms
var all = run<Foo>() // 3688 ms
Cursor<Foo> all = run<Foo>(); // 2930 ms !!!
Direct runCursor<T>(): // 1099 ms 

Amazingly, explicitly declaring the return type up front is even faster than raw dynamic when we know up front the driver's return type. 2930ms!

@fiLLLip
Copy link
Contributor Author

fiLLLip commented Oct 22, 2015

@bchavez That looks very promising 😄 Would it be possible to have both of the following in the API?

Cursor<Foo> all = run<Foo>(); // 2930 ms !!!
Direct runCursor<T>(): // 1099 ms 

I know it would break following the official API, but for the performance thirsty it runs roughly in 1/3 of the time of the dynamic typed, and some may favor that.

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

@fiLLLip absolutely!

In fact, I think it's awesome we have a way to maintain behavior compatibility with Java's run method (and it's forthcoming documentation). At the same time, we can follow dynamicness of the RethinkDB Driver ecosystem, still keep complexity low, without the a huge performance cost of reflection.

And for those that really want performance and know what they're doing, they can use the type-safe runCrusor<T>() that avoids the DLR altogether and get that 1099 ms.

As long as we can be a strict super set of the Java driver's API and behavior I think we'll be in a really good position.

Awesome. I'll take a stab at it tomorrow. Really tired 💤 , but happy we have a promising solution to this.

@deontologician
Copy link

👍 this is awesome. I'm envious of the dynamic options you guys have!

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

Okay. We now have a new Cursor implementation. I re-wrote the Cursor following a more .NET idiomatic approach with IEnumerable<T>. There are some key differences between Java's Iterator and .NET's IEnumerable<T>. Mainly, Java uses hasNext and getNext, where .NET uses bool MoveNext() and T Current. We just combine hasNext and getNext into bool MoveNext() in .NET.

Since we're using IEnumerable<T> we can now write:

[Test]
public void getall_test()
{
    Cursor<Foo> all = r.db(DbName).table(TableName).getAll("a", "b", "c").run<Foo>(conn);

    foreach( var foo in all )
    {
        Console.WriteLine($"Printing: {foo.id}!");
        foo.Dump();
    }
}
/*
Printing: b!
{
  "id": "b",
  "Bar": 2,
  "Baz": 2
}
Printing: a!
{
  "id": "a",
  "Bar": 1,
  "Baz": 1
}
Printing: c!
{
  "id": "c",
  "Bar": 3,
  "Baz": 3
}
*/

Nice. 👍

Additionally, we have the power of LINQ backing IEnumerable<T>, and the following now becomes possible:

[Test]
public void getall_with_linq()
{
    Cursor<Foo> all = r.db(DbName).table(TableName).getAll("a", "b", "c")
                       .run<Foo>(conn);

    var bazInOrder = all.OrderByDescending(f => f.Baz)
        .Select(f => f.Baz);
    foreach( var baz in bazInOrder )
    {
        Console.WriteLine(baz);
    }
}
/*
3
2
1
*/

Even more cool 👍

One word of warning though. I don't recommend using a Cursor underlying FEED with LINQ. LoL. For example, calling .ToList() on an underlying Cursor FEED, in theory, would make you wait forever as capturing a list of infinite items is impossible. Better to use Cursor<T>.BuffferedItems for something like that. :)

Also, we now have a faster runCursor<T> option for those who know the return type and want to avoid dynamic.

RethinkDb.Driver 0.0.5-alpha3 is now published for consumption. Let me know how it goes @fiLLLip :)

@fiLLLip
Copy link
Contributor Author

fiLLLip commented Oct 22, 2015

I tested them now, with and without Cursor<T>.BufferedItems, with and without the runCursor<T> and lastly casting directly to IEnumerable<T> and a combination of everything. No errors yet, and it seems to output the correct results! 👍

@bchavez
Copy link
Owner

bchavez commented Oct 22, 2015

Freaking awesoooooooooooome! 👍 😃

Going to work more unit tests and catch up to the java driver unit tests. After a bit I'll revisit anonymous type optArgs.

@bchavez
Copy link
Owner

bchavez commented Nov 4, 2015

Alrighty, we have some pretty good WIKI documentation going now.

Also, I've documented all the extended C# language features that we offer beyond the Java driver to enhance ReQL syntax:

I also pushed further with implicit override for native types:

//Objects inside Foobar table:
new Foobar {id = "a", Baz = 1, Qux = 1}
new Foobar {id = "b", Baz = 2, Qux = 2}

var exprA = r.table("foobar").get("a")["Baz"]; // 1
var exprB = r.table("foobar").get("b")["Qux"]; // 2
int add = (exprA + exprB + 1).run<int>(conn);
// Executes everything between ( ) on the server
// and returns result 4. Notice "+ 1)" is not executed on the client!

Kind of. Amazing. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants