-
Notifications
You must be signed in to change notification settings - Fork 7.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explore code generation for language adaptors #204
Comments
Some background on why the current design was chosen rather than having each language with a separate version of Observable: The approach of having language specific packages/classes was pursued but did not work well because Rx is a composable library. It means that every time an For example ... From Java a library is exposed that has a method like this: rx.Observable getData() From Groovy a library is exposed with a method like: rx.groovy.GroovyObservable getOtherData() Then from Scala you need to wrap them again: rx.scala.ScalaObservable.from(getOtherData()) This means we have an To compose the two we would have: rx.scala.ScalaObservable.zip(rx.scala.ScalaObservable.from(
getOtherData()),
rx.scala.ScalaObservable.from(getData()),
... scala closure here ...); Now what does Should the above zip operator return If In short, for interop between languages it very quickly becomes a mess and our primary polyglot goal was that For this reason we chose the current language-adaptor model where it is made to understand the closure/lambda types of each of the languages so The reason we did not use an interface such as |
Playing with this the 3 approaches I can see are:
Options 1 and 2 are nice in that they only add the methods needed by the desired languages at runtime, but they are bad in that they both need an initialization hook - something that is a lot to ask for a plain library, not a full framework controlling entire lifecycle so I don't think they are appropriate here. Option 3 would include at compile time all of the methods needed for each language. This means that all runtimes would have that bytecode but if they don't have a particular language runtime loaded those methods could never be invoked. The benefit is that all languages would use static types for method dispatch. The downside is that all the overloads would always exist even if someone is using just one of the languages. A possible solution would be to generate different versions of the binary for each language - which would work fine if an environment is using only one language, but if more than one is being used it would be very confusing because someone couldn't just load the two separate binaries, they'd collide in the same namespace. A single binary with support for each language would need to be used. I think the drawbacks of having all overloads for all languages in a single binary are small enough that it's worth pursuing to see how it works. The only awkward drawback I know of that can't easily be overcome is that while working on the RxJava source itself in an IDE and running Groovy/Clojure/JRuby etc it would be hitting directly compiled classes, not the binaries put through bytecode generation so they would fail. It would mean that tests can only be run from the full build process, or by supporting option 1 or 2 for the development cycle while in an IDE. This impacts a limited number of people though so is not a big enough reason to avoid pursuing the bigger benefits of maintain static typing and avoiding use of instanceof for dynamically determining what language adaptor to use to execute a function. This is very exploratory and something I'm researching as I go (I generally do not like code generation or bytecode manipulation so have not used this approach before) so feedback is much appreciated. |
If you're going with option 3, why not directly add these methods in the source, instead of adding them after compilation? - Probably I'm missing something here. Anyway, instead of overloading the method with a single one where closures are mapped to Instead of public Subscription subscribe(final Object o) for all dynamic languages, you could introduce public Subscription subscribe(final groovy.lang.Closure<?> cls) and public Subscription subscribe(final org.jruby.RubyProc proc) and public Subscription subscribe(final clojure.lang.IFn cls) for each dynamic language. It wouldn't even be absolutely necessary to support Scala directly, as Scala can always hook itself in via implicits. Now, undoubtedly that's a lot of work, but I don't think you can get around that anyway. - But probably I'm overlooking something which makes this impossible? - The only big drawback I can think of atm is that it's not possible in that way to add another JVM-based language without touching the core. This doesn't worry me much, but maybe that's just me. Also, there would be another "easy way out". You probably already thought of that. It's not pretty, though: Keep the methods taking objects, but instead of overloading the typesafe methods, name those differently. For example, pair public Subscription subscribe(final Object o) with public Subscription typedSubscribe(final Action1<T> onNext) Though it's certainly not ideal, I could live with that... |
We could add all of the overloads directly to the source - but we're talking about 48 methods already and eventually probably double that and then multiplying that by the number of languages. It makes the javadoc, source code and maintenance rather ridiculous and it becomes very error prone that every time a method is added or touched the overload for every language must also be done. It would severely impede people getting involved and touching the Observable.java class. For these reasons I feel it's not a great approach - though it is a valid one. I also strongly want to avoid different names - as that breaks the API across languages and doesn't really solve the real problem as people now need to know the different naming conventions in order to stay "type safe". The amount of code to generate the methods is actually quite small, here's a prototype of me playing with it: public class LanguageAdaptorCodeGenerator {
static ClassPool pool = ClassPool.getDefault();
public static void addSupportFor(Class<?>[] types) {
CtClass cc;
try {
cc = pool.get(rx.Observable.class.getName());
} catch (NotFoundException e) {
throw new RuntimeException("Failed to add language adaptor methods as could not find rx.Observable.", e);
}
try {
for (CtMethod method : cc.getMethods()) {
CtClass[] argTypes = method.getParameterTypes();
boolean containsFunctionType = false;
for (CtClass argType : argTypes) {
if (isRxFunctionType(argType)) {
containsFunctionType = true;
}
}
if (containsFunctionType) {
try {
addMethod(cc, method, types);
} catch (Exception e) {
e.printStackTrace();
throw new RuntimeException("Failed to add language adaptor method: " + method.getName(), e);
}
}
}
// will need to output to filesystem here
//cc.toClass();
} catch (Exception e) {
e.printStackTrace();
throw new RuntimeException("Failed to add language adaptor methods.", e);
}
}
private static boolean isRxFunctionType(CtClass type) throws Exception {
// look for the Function argument
if (type.getName().equals(Function.class.getName())) {
return true;
}
// look for arguments implementing Function
for (CtClass implementedInterface : type.getInterfaces()) {
if (implementedInterface.getName().equals(Function.class.getName())) {
return true;
}
}
return false;
}
private static void addMethod(CtClass ctClass, CtMethod method, Class<?>[] types) throws Exception {
for (Class<?> cls : types) {
ArrayList<CtClass> parameters = new ArrayList<CtClass>();
CtClass[] argTypes = method.getParameterTypes();
System.out.print(">>>>>> method: " + method.getName() + " args: ");
for (CtClass argType : argTypes) {
System.out.print(argType.getName() + " ");
if (isRxFunctionType(argType)) {
// needs conversion
parameters.add(pool.get(cls.getName()));
} else {
// no conversion, copy through
parameters.add(argType);
}
}
System.out.println("");
CtClass[] newArgs = parameters.toArray(new CtClass[parameters.size()]);
String params = "";
for (CtClass a : newArgs) {
params += a.getName() + " ";
}
System.out.println("Add method => " + method.getName() + " => " + params + " --> with arg type converted to: " + cls.getName());
CtMethod newMethod = new CtMethod(method.getReturnType(), method.getName(), newArgs, ctClass);
newMethod.setModifiers(method.getModifiers());
newMethod.setBody("{ System.out.println(\"hello world!\");return null; }");
ctClass.addMethod(newMethod);
}
}
} Here are the methods that get overloads (for just Groovy in this output):
I haven't yet made the method body do the right thing (it's printing hello world right now), but it's basically taking what is already in the language adaptors and just inlining it. The part I haven't figured out yet is hooking into the Gradle build process - but it's just busy work I need to spend time on. |
Awesome. - I expected a lot more work to be necessary to generate those additional methods. But this looks quite manageable. You're right, the code generation approach looks better in this case than manually adding all those methods. |
Another option a colleague suggested is to generate the source code before compilation instead of doing bytecode manipulation after. Generating source code isn't as "clean" as using Javassist like that code above, but the benefit would be that source code is then shipped that matches the bytecode so that stacktraces and clicking through to source goes to the right place when debugging. That may be a strong enough value to do source code generation rather than bytecode generation. |
- This allows for type safety in statically-typed languages - This prevents dynamically-typed languages from hooking into rxjava-core. -- See ReactiveX#204 for details on code generation for dynamic languages * Added Scala implicits into rx.lang.scala.RxImplicits * Added tests of most methods on Observable using Scala functions * Fixed Scala Gradle/ScalaTest build
I've got some work-in-progress up on my fork at: mattrjacobs@f5f84a8. The commit log describes what's been done and what's left to do before I'll feel comfortable submitting it. |
Would it be possible to use the @deprecated annotation for the Object versions? This would cause java/scala to atleast warn when we are accidentally using the Object overload but might not translate out to the dynamic languages. |
We are getting close to eliminating these methods (week or two more at most On Tue, Jul 2, 2013 at 9:55 PM, samhendley [email protected] wrote:
Ben Christensen - API Team |
After implementing and throwing away a few different approaches we have landed on a solution we feel will balance the various competing priorities. Our goals are:
The solution we have arrived at will work as follows:
For example: The default Java version: public static <T> Observable<T> filter(Observable<T> that, Func1<T, Boolean> predicate) A Groovy version: public static <T> Observable<T> filter(Observable<T> that, groovy.lang.Closure predicate)
A project will include just the jar that meets their language needs, there will no longer be a "core" jar plus the language adaptor. The drawback of this is that mixing two of these in a classpath will result in non-deterministic loading (whichever is loaded last wins) and that is the version that will be used. This means if a library depends on rxjava.jar but is using Groovy and needs rxjava-groovy.jar it is up to the developer of that project to make sure they have only the rxjava-groovy.jar version. This is not ideal but is a one-time pain setting up a build and is better than the constant pain of missing static typing or converting to/from different Observable implementations for different languages.
This should not break any code but will require a slight change to the build dependencies in your project when we release this. We hope that this enables the RxJava project to better achieve its goal of being polyglot and targeting the JVM in general and not any specific languages without sacrificing interop or idiomatic usage in each of the languages. |
Byte code generation ended up not being used. We instead went with the solution in #323 |
Need these until we finish work at ReactiveX#204
Need these until we finish work at ReactiveX/RxJava#204
RxJava was written from the beginning to target the JVM, not any specific language.
As a side-effect of Java not having lambdas/clojures yet (and other considerations), Netflix used dynamic languages with it predominantly for the year of its existence prior to open sourcing.
To bridge the rxjava-core written in Java with the various languages a FunctionalLanguageAdaptor is registered at runtime for each language of interest.
To enable this language adaptors method overloads with
Object
exist in the API sinceObject
is the only super-type that works across all languages for their various implementations of lambdas and closures.This downside of this is that it breaks static typing for Java, Scala and other static-typed languages. More can be read on this issue and discussion of the subject here: https://groups.google.com/forum/#!topic/rxjava/bVZoKSsb1-o
I would like to pursue the following:
Object
overload methods from rxjava-core so it remains statically typedrx.Observable
for the types needed by that languages (such asgroovy.lang.Closure
orclojure.lang.IFn
).The benefits of this are:
Object
which then has to do a lookup in a map. Memoization helped with the performance but each method invocation still required looking in a map for the correct adaptor. With this approach the appropriate methods will be compiled into therx.Observable
class to correctly invoke the right adaptor without lookups.The byte code generation step would happen once the first time a LanguageAdaptor is registered which is normally only at the first point of access to RxJava.
The text was updated successfully, but these errors were encountered: