Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Break existing DSL to match gradle idioms #32

Closed
nedtwigg opened this issue Aug 28, 2016 · 6 comments
Closed

Proposal: Break existing DSL to match gradle idioms #32

nedtwigg opened this issue Aug 28, 2016 · 6 comments

Comments

@nedtwigg
Copy link
Member

First off, let's review how Spotless works. Each FormatTask has an Iterable<File>, and a list of FormatterStep. FormatterStep is really just a Function<String, String> - take an unformatted string, return a formatted string. So to create a new task, you specify the files you want to format, then the list of steps you want to take to format them.

Right now, the spotless DSL works like this:

spotless {
    // create a new FormatExtension, and delegate it to the closure
    format 'misc', {
        // specify the files you want to format
        target '**/*.md'
        // and the steps you want to apply, in order
        trimTrailingWhitespace()
        custom 'customStepName', {
            // some function which takes a string, and returns a string
        }
    }

Over in #31, @oehme made a suggestion to change the DSL to be more gradle idiomatic, which i'm moving here. A first proposal is this:

spotless {
    // define formats for applying to certain kinds of code
    formats {
        misc {
            extensions '.gradle', '.md'
            trimTrailingWhitespace true
            step('customStep') {
                  // imperative code here
            }
        }
        all {
            indentWith '\t'
        }
    }
    // target specific file trees to format
    targets {
        buildFiles {
            dir
        }
    }
}

There's three changes wrapped up in this:

  1. Separate the definition of a format, and the files which it applies to.
    • Looking at the buildscripts where Spotless is in use, there is a 1:1 between formats and the files they apply to. I think this adds a concept to learn without adding a feature users need.
  2. Rather than trimTrailingWhitespace() being a step which gets added, it becomes a property which gets set. Presumably if it is set to true then we add the step, and if it is not set to true then it doesn't get added.
    • This makes Spotless more declarative, and allows us to remove an empty pair of parentheses to better match the Gradle style, since Gradle DSL doesn't allow no-argument statements.
    • Unfortunately, function application does not commute. If you change the order that you apply rules, it will change the output, as happened here. Depending on what order you set the properties, or what order Spotless listens to the results of what you set, you'll get a different outcome.
    • Spotless currently "fibs" a bit. It looks a little declarative, but it's actually very imperative. Each "line" in a format adds a function to a list, and then those functions are applied in order. The nice part is that it's easy to describe this to users who need custom rules, and in the corner-cases where the order of formatting steps matters, it's easy to fix.
  3. By using gradle containers, rather than format 'misc', {}, we can now just say misc {}, which allows us to also use the all{} block to apply some logic to all the formats.
    • Pro: Definitely looks better.
    • Con: When does the 'all' block take effect? Does it run before the others or after? Does it depend on its order in the file relative to the other blocks?
    • Con: The distinction between a custom "format" block and a "magic" java/freshmark block is now less clear. How come I can call eclipseFormatFile if my block is called java but not javaSource?
    • Con: Breaks all existing custom scripts.
    • Con: Future contributors will have to know a little about gradle containers, which makes test code like this harder to write.

Imo, 1 & 2 do not improve Spotless. 3 definitely does, but the tradeoff is that we have to add the Gradle-specific container concept. Gradle has done a fantastic job of taking the "convention-over-configuration" idea, mixing it with a scripting language, and fostering a vibrant plugin ecosystem. But it's also, imo, made a few mistakes. I'm skeptical about whether the container idea adds enough value to be worth learning, and asking Spotless' users to learn.

Certainly there are ways to improve Spotless' design, but I'm hesitant to break existing user's builds unless it makes a big difference for future adopters. My ears are open, but I don't see a clear enough win to justify the cost.

@nedtwigg nedtwigg changed the title Proposal: Change DSL to match gradle idioms Proposal: Break existing DSL to match gradle idioms Aug 28, 2016
@oehme
Copy link

oehme commented Aug 28, 2016

Looking at the buildscripts where Spotless is in use, there is a 1:1 between formats and the files they apply to.

This is only true because spotless does not allow them to do differently in a convenient manner ;) It's all about more incremental builds: What you really want is to apply the java format to several targets (main and test for instance). You only want a target to be re-checked if one of the source files changed. The integration with the Java sourceSets would of course be done automatically, so users would not even need to write code for that, but benefit from faster builds.

Spotless currently "fibs" a bit. It looks a little declarative, but it's actually very imperative. Each "line" in a format adds a function to a list, and then those functions are applied in order. The nice part is that it's easy to describe this to users who need custom rules, and in the corner-cases where the order of formatting steps matters, it's easy to fix.

I did not think about the ordering issue, please scratch this proposal of mine :)

When does the 'all' block take effect? Does it run before the others or after? Does it depend on its order in the file relative to the other blocks?

It is the usual way to define defaults: When called, it applies to all elements already in the container, as well as all those added in the future. Similar things can be done with matching {} for instance. It would allow people to build convention plugins on top of spotless, so you don't need to add support for every exotic language to the core ;)

The distinction between a custom "format" block and a "magic" java/freshmark block is now less clear. How come I can call eclipseFormatFile if my block is called java but not javaSource?

Good point, that distinction is important. We can add a type and remove the magic meaning of 'java':

javaSource(JavaFormat) {
  //now it is clear why I have special methods
}

When no type is given, we use a standard format.

Con: Future contributors will have to know a little about gradle containers, which makes test code like this harder to write.

I don't understand how it would be harder to write. Also, contributors are likely to be Gradle users and they expect domain object containers. This is how all Gradle core DSLs work. You might not realize it, but configurations, sourceSets, tasks and many more are all domain object containers. Users can take what they know about one and write the same kind of code against the other. The current DSL of spotless feels alien compared to that.

@oehme
Copy link

oehme commented Aug 28, 2016

I would still automatically add the java format when the java plugin is applied though. You don't want users to create it from scratch and stating obvious things like "the file extension is .java" :)

But the type argument does add great new possibilities for extension. Someone could for instance add a special formatter for C++:

cpp(CppFormat) {
  //C++ specific methods here
}

@nedtwigg
Copy link
Member Author

I still don't understand when the all block executes.

formats {
    java {   indentWithTab()   }
    python { indentWithSpace() }
    all {    indentWithTab()   }
}

The python format will definitely end up with both indentWithSpace and indentWithTab rules. But will indentWithTab be the first, or the last rule? If I put the all block at the top, does that change?

cpp(CppFormat) { the type argument does add great new possibilities for extension

That's a cool feature. Obviously you understand gradle well, and I think you understand the issues I raised with the new DSL. If you or somebody else submits a PR with a working example of how JUnit should rewrite their build.gradle, and the new syntax is as clear or clearer, I'll merge and release. Not because I'm convinced it's a good idea, but because I trust the gradle team and sometimes I'm wrong.

Just for comparison:

cpp(CppFormat) {
    formatIncludes()
    // knowledge needed to write CppFormat:
    //     Spotless API
    //     Gradle domain object containers
}

Another way to extend

format 'cpp', {
    CppFormat.applyTo(this, {
        formatIncludes()
    })
    // knowledge needed to write CppFormat:
    //     Spotless API
    //     Function calls
}

contributors are likely to be Gradle users and they expect domain object containers

I'd be curious to know what percentage of the plugins in the plugin portal define a domain object container. I know the rust community has a mechanism to survey feature usage across their ecosystem... But regardless of what plugin developers know, certainly a survey of every person who ran gradlew build today would show that very few of them would know what a domain object container is. There are more java programmers than gradle plugin developers, why would I prefer targeting the subset when I can target the super?

Every ecosystem has suppliers and consumers. Idiomatic java handles an error case by throwing a checked exception. If you're supplying an API, checked exceptions let you concisely describe what can go wrong in your code. But if you're just consuming an API to ship an application and you need to call a method that throws an exception but your superclass doesn't let you rethrow, then statistically speaking you'll probably just swallow it or log it to a mystery stream. If you're a build engineer, domain objects enable unmatched brevity. But for people who are just trying to tinker with the build to make a small change, they traded a little verbosity for a lot of magic.

These are the things I like about gradle:

  • there's a small API for building the task graph which is easy to learn on the fly
  • I can manipulate that API with a real programming language, not some weird DSL (XML or otherwise)
  • the "apply plugin" model encourages convention-over-configuration while still letting me muck with that convention using a real programming language

These are the parts of gradle that I fight with:

  • The "real programming language" isn't the language that I use. It's another thing to learn. I wish it was just an API, without groovy or kotlin or anything else attached. Maybe a teensy bit of magic, such as "your build script is copied into public class Project{ public void init() { %your script here% } }.
  • Idiomatic gradle emphasizes a weird DSL at the expense of the real language underneath. If I wanted a purely-declarative DSL, I'd still be using Maven. I want a real programming language so I can express my needs with function calls and data structures - preferably statically-typed. You don't need a DSL to do declarative programming, and DSL's often hide imperative realities (as demonstrated by the ordering issue described earlier).

I see that domain objects allow another kind of convention-over-configuration, but I don't see them as an innate virtue. I still don't understand whether the "all{}" block will execute before or after the "misc{}" block. I don't understand if it applies to all blocks everywhere, or just the ones in the current project, or maybe just its children? And I don't understand how they're possibly simpler for my users than the existing DSL. But I'm wrong all the time, maybe I'm wrong here too.

@oehme
Copy link

oehme commented Sep 2, 2016

I still don't understand when the all block executes.

It is executed at the point where you write it. But it also adds a callback that will be executed for all future additions to the container. You generally use the all block first to set up some defaults. Or you might use it to register a callback for creating other tasks based on the configuration. Or maybe you just want to log some information. The point is you don't know what users might want to do with your configuration objects, so the domain object container provides all the flexibility they need out of the box.

A normal user wouldn't have to write this:

cpp(CppFormat) {
    formatIncludes()
}

The author of the spotless-cpp plugin would provide an opinionated version of his plugin which automatically creates the cpp format. The user can then refer to it just by name

cpp {
    formatIncludes()
}

But for advanced users, the spotless-cpp author could also provide an unopinionated base version of his plugin, which just provides the type, but doesn't actively configure anything. The advanced user could then create her own format as you showed above.

This is what often happens in build teams of bigger enterprises. They want to build their own conventions, without completely reinventing the wheel. Gradle makes this easy, as long as plugin authors follow the best practices.

I'd be curious to know what percentage of the plugins in the plugin portal define a domain object container. I know the rust community has a mechanism to survey feature usage across their ecosystem... But regardless of what plugin developers know, certainly a survey of every person who ran gradlew build today would show that very few of them would know what a domain object container is. There are more java programmers than gradle plugin developers, why would I prefer targeting the subset when I can target the super?

We neeed to look at two different target audiences: Plugin authors/build engineers who build plugins and conventions. And on the other side build users who just consume those plugins/conventions. The former will have a good understanding of these concepts and they are the ones building on top of your API. The latter need one thing first and foremost: Consistency. That's why all Gradle DSLs use the same patterns like domain object containers and that's why all plugins should follow these conventions. They make it easier both for integrators as well as the end user.

If you're a build engineer, domain objects enable unmatched brevity. But for people who are just trying to tinker with the build to make a small change, they traded a little verbosity for a lot of magic.

I think this is where we will have to just agree to disagree :) I think Gradle's use of the same patterns everywhere makes it easier for users trying to find their way around. Build scripts are easier to read when they contain less imperative-style code. You have other preferences and that's completely fine. Gradle allows you to use a more imperative coding style if that makes it a bette experience for you. But as a plugin author, I just want to encourage you to follow the examples set by the core plugins, so users will have less mental overhead when configuring your plugin.

I'm happy you are considering pull requests for this and I hope somebody will pick up the task. I will definitely provide help and feedback, so please don't hesitate to ask :)

@nedtwigg
Copy link
Member Author

nedtwigg commented Sep 3, 2016

Gradle's use of the same patterns everywhere makes it easier for users trying to find their way around

I agree with this principle, which is why (coupled with your experience) I'll take a PR even if I don't understand the advantages of a domain object container.

Build scripts are easier to read when they contain less imperative-style code. You have other preferences and that's completely fine.

I also prefer declarative code to imperative code, same as you. But a list is just as declarative as a set. If possible, it's best to make it so order doesn't matter, but often it does. And that's not always a bug - the fact that ordering matters is the only reason that a convention plugin can setup some defaults which can later be partially overwritten by a buildscript.

Domain object containers obscure ordering. Spotless has a declarative-looking DSL that's easy to read, but it is order-dependent. It is easier to write a Function<String, String> than a text processor which guarantees commutivity. So spotless looks declarative, and it mostly acts declarative too, but it sometimes leaks through that the order of the rules matters. And when that happens, it's easy for the user to figure that out and fix it if the code path which sets up those rules hasn't been obscured.

But since Spotless behaves mostly declaratively, I'll merge and release a PR even if I still don't know what this would do.

// I'm guessing that java ends up being indented by space, and python by tabs?  Maybe?
formats {
    java   { indent = tab   }
    all    { indent = space }
    python { indent = tab   }
}

@nedtwigg nedtwigg added this to the 3.x milestone Jan 3, 2017
@nedtwigg
Copy link
Member Author

nedtwigg commented Jan 4, 2017

It's about to be last call for this (3.0 is about ready). I don't understand the benefit, but I can be swayed by code :)

@nedtwigg nedtwigg removed this from the 3.x milestone Jan 20, 2017
@nedtwigg nedtwigg closed this as completed Mar 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants