Implement generic number literals #6919

Merged
merged 5 commits into from
Aug 30, 2019

Conversation

odersky
Contributor

@odersky odersky commented Jul 23, 2019

Example:

val x: BigInt = 111111100000022222222222
  • Also works for user defined types.
  • Conversion can be done at compile time, with customizable errors.
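
As a rough illustration of the "user defined types" point, here is a minimal sketch of a type class instance that a literal could be routed through. It assumes the `scala.util.FromDigits` trait as it exists in current Scala 3 libraries (the API at the time of this PR may have differed); the `Even` type is hypothetical, and the instance is called explicitly so the snippet does not depend on the experimental language import.

```scala
import scala.util.FromDigits

// Hypothetical user-defined number type, not part of the PR.
final case class Even(n: BigInt)

object Even:
  // Instance the compiler would look up for a literal whose expected type is Even.
  given FromDigits[Even] with
    def fromDigits(digits: String): Even =
      val n = BigInt(digits)
      if n % 2 == 0 then Even(n)
      else throw FromDigits.MalformedNumber(s"$digits is odd")

// Explicit call standing in for `val e: Even = 111111100000022222222222`.
val bigEven: Even =
  summon[FromDigits[Even]].fromDigits("111111100000022222222222")
```

Calling `fromDigits` with an odd digit string throws at run time here; the compile-time variant discussed in the PR would need the method to be an inline macro.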

@propensive
Contributor

I'm very happy to see this! Are there plans to provide support for pattern matching too?

@odersky
Contributor Author

odersky commented Jul 24, 2019

> I'm very happy to see this! Are there plans to provide support for pattern matching too?

I have not thought about it. Do you have a proposal how to do that?

@odersky odersky force-pushed the add-generic-literals branch from bdeda22 to 77e9179 Compare July 24, 2019 12:03
@odersky
Contributor Author

odersky commented Jul 24, 2019

Regarding pattern matching, if the expansion of a literal creates something matchable (such as a case class instance) we are good. The problem is what to do if that's not the case. Right now, we treat an application f(e) in a pattern always as a constructor pattern that is decomposed with unapply. We would need a syntax where it gets treated as a plain function call instead. I.e. something like:

  case {f(e)} => ...

If we have that, the inline expansion of a number literal can add the braces (or whatever) to make it work. So the problem is really more general than just literals.

@propensive
Contributor

propensive commented Jul 24, 2019

I had only ever thought about it casually, but the case {f(e)} => ... syntax looks interesting. I'm yet to discover whether Scala 3 has sufficient macro support for me to reimplement Kaleidoscope, to allow pattern matching on regular expressions, e.g.

str match {
  case r"$firstName@([a-zA-Z]*) $lastName@([a-zA-Z]*)" => (firstName, lastName)
  case _ => ("Bad", "Name")
}

(which was probably the most obscure feature I ever used in Scala 2.)

but it's quite high up my list of desirable features. Relatedly, I also often thought it would be nice to be able to explicitly specify types to extractors, like so:

str match {
  case As[Int](int) => int
  case _ => 0
}

I'll think about the general idea some more.
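
For comparison, the name-splitting match above can be approximated today with the standard library's `Regex` extractor. This is only a sketch of the existing alternative, not the Kaleidoscope interpolator; `FullName` and `splitName` are names made up for the example.

```scala
import scala.util.matching.Regex

// Two capture groups; the extractor only succeeds if the whole string matches.
val FullName: Regex = "([a-zA-Z]*) ([a-zA-Z]*)".r

def splitName(str: String): (String, String) = str match
  case FullName(firstName, lastName) => (firstName, lastName)
  case _                             => ("Bad", "Name")
```

The difference from the interpolator form is that the groups are bound positionally rather than named inside the pattern text.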

@odersky odersky force-pushed the add-generic-literals branch from 77e9179 to 5a3f872 Compare July 24, 2019 14:17
@odersky
Contributor Author

odersky commented Jul 25, 2019

It turns out that the idea to treat contents of blocks in patterns as expressions works with some minor tweaks. So this means we support generic literals as patterns now. So far, this is all internal; we do not support {...} as surface syntax yet. Whether we should do this is another discussion to have.
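
Since the block form is not surface syntax, user code that wants to match against a computed value still has to go through a stable identifier or a guard. The sketch below shows those two existing techniques, which compare with `==` in much the same spirit as an expression-valued pattern; it is not the internal encoding used by the PR, and `Huge` and `describe` are illustrative names.

```scala
// A stable identifier (capitalized val) is compared with == when used as a pattern.
val Huge: BigInt = BigInt("111111100000022222222222")

def describe(x: BigInt): String = x match
  case Huge                   => "the huge constant"
  case n if n == BigInt("42") => "forty-two" // guard: arbitrary expression
  case _                      => "something else"
```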

@propensive
Contributor

That sounds like it can open up a whole variety of interesting possibilities. Thank you!

Evaluating this expression throws a `NumberTooLarge` exception at run time. We would like it to
produce a compile-time error instead. We can achieve this by tweaking the `BigFloat` class
with a small dose of meta-programming. The idea is to turn the `fromDigits` method
of the into a macro, i.e. make it an inline method with a splice as right hand side.
Contributor

of the...?

@propensive
Contributor

@odersky Do you think there is any use in tweaking the parser to support multiple adjacent .s in number literals, in order to provide nicer syntax for integer ranges?

for(i <- 1..100) println("")

This would obviate the need for to, and .. would have higher precedence than an alphanumeric infix method, which would be more convenient in a few cases.

@odersky
Contributor Author

odersky commented Jul 26, 2019

> @odersky Do you think there is any use in tweaking the parser to support multiple adjacent .s in number literals, in order to provide nicer syntax for integer ranges?

I like to and until. .. by itself is not enough since it does not let you do half open intervals.
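
The half-open point can be seen with the existing standard-library methods; a small sketch (the value names are illustrative):

```scala
// `to` includes the upper bound, `until` excludes it.
val closed: Range.Inclusive = 1 to 5    // 1, 2, 3, 4, 5
val halfOpen: Range         = 1 until 5 // 1, 2, 3, 4
```

A single `..` operator would have to pick one of these two meanings.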

@odersky
Contributor Author

odersky commented Jul 26, 2019

@anatoliykmetyuk @nicolasstucki Can you figure out why the CI does not pass? Everything works fine locally.

@anatoliykmetyuk
Contributor

anatoliykmetyuk commented Jul 26, 2019

@odersky toExpr has changed its signature, the correct implementation is:

  given as Liftable[BigInt] {
    def toExpr(x: BigInt): given (qctx: QuoteContext) => Expr[BigInt] =
      '{BigInt(${x.toString.toExpr})}
  }

It should also fail locally. However, I encountered troubles recompiling the PR: even after doing all:clean, I had some broken classes with wrong tasty signatures that were not recompiled. rm -rf out/bootstrap/ helped me; I think if you do rm -rf out/ there's a chance it will also start failing locally for you.

@nicolasstucki
Contributor

Or slightly more compact

  given as Liftable[BigInt] {
    def toExpr(x: BigInt) = '{ BigInt(${x.toString.toExpr}) }
  }

@nicolasstucki
Contributor

May also need a rebase to have the 0.17.0-RC1 reference compiler.

Contributor

@nicolasstucki nicolasstucki left a comment

Otherwise LGTM

title: Numeric Literals
---

In Scala 2, numeric literals were confined to the promitive numeric types `Int`, Long`, `Float`, and `Double`. Scala 3 allows to write numeric literals also for user defined types. Example:
Contributor

Suggested change
In Scala 2, numeric literals were confined to the promitive numeric types `Int`, Long`, `Float`, and `Double`. Scala 3 allows to write numeric literals also for user defined types. Example:
In Scala 2, numeric literals were confined to the primitive numeric types `Int`, Long`, `Float`, and `Double`. Scala 3 allows to write numeric literals also for user defined types. Example:

The companion object of `BigFloat` defines an `apply` constructor method to construct a `BigFloat`
from a `digits` string. Here is a possible implementation:
```scala
object BigFloat extends App {
```

Contributor

Suggested change
object BigFloat extends App {
object BigFloat {

case digits =>
'{apply($digits)}
}
}
Contributor

Suggested change
}
} // end BigFloat

@@ -703,7 +767,8 @@ class Typer extends Namer
(index(stats), typedStats(stats, ctx.owner))

def typedBlock(tree: untpd.Block, pt: Type)(implicit ctx: Context): Tree = track("typedBlock") {
val (exprCtx, stats1) = typedBlockStats(tree.stats)
val localCtx = ctx.retractMode(Mode.Pattern)
Contributor

Where did this Mode.Pattern come from?
Was it from the pattern of something like?

bigFloat match {
  case 123_344_537_244_453E433 => ???
}

Contributor Author

Yes, exactly. We might make it user-accessible syntax though.

inferImplicit(fromDigitsCls.typeRef.appliedTo(target), EmptyTree, tree.span) match {
case SearchSuccess(arg, _, _) =>
val fromDigits = untpd.Select(untpd.TypedSplice(arg), nme.fromDigits).withSpan(tree.span)
val firstArg = Literal(Constant(digits.toString))
Contributor

Suggested change
val firstArg = Literal(Constant(digits.toString))
val firstArg = Literal(Constant(digits))

@odersky odersky force-pushed the add-generic-literals branch from 6bacde0 to f1a26bd Compare July 30, 2019 15:54
@odersky
Contributor Author

odersky commented Aug 2, 2019

The main open issue is what the right typeclass infrastructure should be.

Question 1: Should FromDigits and FromString be merged?

Merging is attractive because it simplifies the design but in summary I believe the two classes are better kept separate. The contract for FromDigits is different since we only need to be ready to parse strings that are numbers. That means implementations of FromDigits can dispense with a number of checks that implementations of FromString have to perform.

Question 2: Should we collapse the four classes FromDigits, FromDigits.WithRadix, FromDigits.Decimal and FromDigits.Floating into a single one?

Again, we trade off conceptual simplicity for ease of implementation. Here the choice actually does make a difference. Say, we have class FooNum that implements FromDigits but not FromDigits.Decimal. We write

val x: FooNum = 2.0

Under the current 4-class scenario, 2.0 would be treated as a floating point number. The program might still typecheck if there is an implicit conversion from Doubles to FooNums.

Under a collapsed scenario, 2.0 would be treated as a FooNum and the conversion would give an error.

So the current 4-class scheme is closest to the status quo; it only treats a literal as a user defined type if the user defined type has declared that it accepts that kind of literal.
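
To make the opt-in idea concrete, here is a minimal sketch assuming the `scala.util.FromDigits` trait and its nested `FromDigits.Decimal` marker as found in current Scala 3 libraries. `FooNum` is the hypothetical class from the comment above, and `acceptsDecimalLiterals` is a helper invented for the example; it only inspects the instance at run time, whereas the compiler makes the equivalent decision during type checking.

```scala
import scala.util.FromDigits

// Hypothetical type that accepts whole-number literals only.
final case class FooNum(n: BigInt)

object FooNum:
  // Plain FromDigits instance: no Decimal, so `2.0` would not be routed here.
  given FromDigits[FooNum] with
    def fromDigits(digits: String): FooNum = FooNum(BigInt(digits))

// Illustrative helper: does the instance for T declare support for decimal literals?
def acceptsDecimalLiterals[T](using fd: FromDigits[T]): Boolean =
  fd.isInstanceOf[FromDigits.Decimal[?]]
```

Under the 4-class scheme, a type has to provide the more specific instance before literals with a decimal point are handed to it.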

@odersky
Contributor Author

odersky commented Aug 2, 2019

In summary, I propose to stick with the status quo and merge it now. We can still change it later if different arguments come to light.

@sjrd
Member

sjrd commented Aug 2, 2019

I'm not convinced by this feature.

For starters, what is really the motivation?

The user-defined literals are expected-type-driven. This means that they work in some cases, but not in others. In particular:

  • Somehow this works for pattern matching, but I don't see it working with ==, i.e., I don't see how someCustomNumberValue == 5 is going to do the right thing.
  • It will probably work at the right-hand-side of operators, such as someCustomNumber + 5, because the expected type of def +(arg) will be CustomNumber, but it won't work at the left-hand-side of operators.

Can I have actual constant expressions of type Byte and Short with this feature? Currently it is impossible because neither 5.toByte nor (5: Byte) is a constant expression, and that impossibility is a known problem in Scala.

Why does this work for numeric literals, but not for other literals? In particular, could I recover symbol "literals" from string literals with something like this?

Why is this considered when we already have string interpolators and we could write bi"10_000_000_000" instead?

Why is this feature considered morally better than implicit conversions, which are basically deprecated?

@odersky
Contributor Author

odersky commented Aug 2, 2019

> i.e., I don't see how someCustomNumberValue == 5 is going to do the right thing.

Just define an overloaded equality == in SomeCustomNumber.

> It will probably work at the right-hand-side of operators, such as someCustomNumber + 5, because the expected type of def +(arg) will be CustomNumber, but it won't work at the left-hand-side of operators.

Yes, that looks like a hard limitation.

> Can I have actual constant expressions of type Byte and Short with this feature?

No. Primitive number types are treated as before. But 5.toByte can already now be a constant expression if toByte is an inline method.

> bi"10_000_000_000"

Simplicity. I find that string interpolator really ugly. It's one of the clear advantages of languages with unlimited precision numbers that you can define numbers without artificial restrictions. Let's face it: the whole way we define numeric literals with suffixes and so on is artificial and unnatural. A literal is just what it is. It is given meaning by ascribing a type.

@sjrd
Member

sjrd commented Aug 2, 2019

> Just define an overloaded equality == in the type of someCustomerValue.

That has the same problem with the left-hand-side: 5 == someCustomNumberValue will not be equivalent to someCustomNumberValue == 5.

> Let's face it: the whole way we define numeric literals with suffixes and so on is artificial and unnatural.

I disagree. The whole way we have different types of numbers with differing precisions is artificial and unnatural. It goes against math. But given that we have different types of numbers with different behaviors (based on ad hoc polymorphism to boot!), I'm very happy we have to use explicitly typed literals. They prevent potential warts.

@odersky
Contributor Author

odersky commented Aug 2, 2019

I believe the examples with 5 as the number are all misleading. We already know how to deal with small numbers and custom types using implicit conversion. I.e.

5 + myBigInt

and

myBigInt + 5

both work and can be made to work by library code for arbitrary number types.

So the main improvement that this PR brings is that we can write large number literals which are not constrained by the ranges of existing number types. However, in practice you'd rarely write a large number literal inline. Common practice recommends that you define a constant instead. So, with this proposal it's

val x: Int = 100
val y: Long = 10_000_000_000
val z: BigInt = 10_000_000_000_000_000_000_000

instead of the status quo:

val x: Int = 100
val y: Long = 10_000_000_000L
val z: BigInt = bi"10_000_000_000_000_000_000_000"

I find the first much more pleasant and regular than the second. The limitations when you can or cannot use a really large number inline in an expression are very similar to mixing existing small number literals and custom types. There is as far as I can see only one restriction for large literals:

50_000_000_000_000_000_000_000 + myBigInt

will not work. It works for small literals on the left since there are implicit conversions from Int and Long to { + : BigInt => ? }. The conversion trick does not extend to typing literals. But overall I think that's a small restriction and we can live with it.
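
A short sketch of the situation described above, using the standard `BigInt` and its implicit conversion from `Int`. The value names are illustrative, and the large operand is built from a string because a bare literal of that size is not accepted without the generic literal feature.

```scala
val myBigInt: BigInt = BigInt(7)

// Small literals work on either side thanks to the Int => BigInt conversion.
val left: BigInt  = 5 + myBigInt
val right: BigInt = myBigInt + 5

// A large literal on the left has no Int or Long value to convert from,
// so it has to be constructed explicitly.
val large: BigInt = BigInt("50000000000000000000000") + myBigInt
```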

@odersky
Contributor Author

odersky commented Aug 2, 2019

I should also note that it's not my idea. I got this from Guy Steele, who proposed this some time ago.

@smarter
Member

smarter commented Aug 2, 2019

An extra concern in Scala is that so far, all literals have a corresponding singleton literal type (symbol literals are deprecated so I'll ignore them), but with this proposal, this would no longer be true. It's not necessarily a deal-breaker but I think it's worth exploring alternative designs that would support singleton types.

@denisrosset

Amazing idea. Spire needs a lot of complex/hackish code to help people write mathematical like notation and it works 85% of the time (especially with all the Int implicit conversions already in the stdlib).

I'll take a closer look later on this.

@odersky
Contributor Author

odersky commented Aug 7, 2019

Note that both Java and Scala already do this to some degree: An Int literal is treated as a Byte, Short, or Char if the expected type is one of these and the literal fits the range. This is not an implicit conversion (there is none between Int and Byte or the other types), but a specialized treatment of literals. In a sense the proposal is to take that idea and apply it to user defined number types.
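
The existing special treatment of integer literals can be seen in a few lines; a minimal sketch with illustrative value names:

```scala
// The literal is typed by the expected type when it fits the range.
val b: Byte  = 100
val s: Short = 30000
val c: Char  = 65 // the character 'A'

// val tooBig: Byte = 200  // does not compile: 200 is outside the Byte range

val i: Int = 100
// val fromInt: Byte = i   // does not compile: no implicit Int => Byte conversion
```

The last commented line is the evidence that this is literal typing rather than a conversion: the same value stored in an `Int` variable is rejected.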

@abgruszecki
Contributor

abgruszecki commented Aug 7, 2019

It seems to me that this feature simply adds more irregularity to the language than it gets rid of. The current status quo, as I understand it, is either defining a string interpolator and using it everywhere, or the complex/hackish implicit conversions that @denisrosset mentioned. Since I do not actually write any code that would use this feature, I will refrain from having any opinion, but I will point out what will (still) not work in the current version of the proposal:

Equality comparisons with polymorphic literals will not work. If we're handling math vectors, then we may want to treat 1 and 0 as vectors. Then something like val vec: Vec3 = 1 would work, but vec == 1 would fail, no matter what we do - the equals(Any) method is always available, and will force the polymorphic literal into Int according to the proposed rules.

In general, binary operators will not work. If I defined a custom MyBigInt that can only be multiplied by other MyBigInts, then doubling it like 2 * myBigInt will fail. In math notation, we'd write 2a, and I am a sample person that would prefer to write 2 * a instead of a * 2 in code as well.

Defining constants with expressions will not work:

val tau: BigFloat = 2 * 3.1415926535897932384626433832795028841971

That is, the "natural" way of defining tau fails to compile. Even worse, since the RHS will default to Double if it is small enough, expressions like that may on occasion compile and produce an imprecise result.
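
The precision loss can be reproduced with the standard `BigDecimal` standing in for the hypothetical `BigFloat`. This is a sketch of the pitfall only; `viaDouble` and `viaString` are illustrative names.

```scala
val digits = "3.1415926535897932384626433832795028841971"

// The product is computed in Double first, so the extra digits are already gone.
val viaDouble: BigDecimal = BigDecimal(2 * 3.1415926535897932384626433832795028841971)

// Parsing the digits first keeps every digit through the multiplication.
val viaString: BigDecimal = BigDecimal(digits) * 2
```

Both lines type check, which is why the imprecise version is easy to write by accident.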

Two notable languages which do have polymorphic literals are Haskell and Rust. However, both of them also have complete local type inference, which avoids all of the above problems and makes the feature work that much better.
Polymorphic literals are particularly pleasing in Rust, which differentiates between signed/unsigned and 32/64-bit integers, but where we do not need to care about that at all when using local literals:

fn print(i: i64) {
    println!("{}", i)
}

fn main() {
    let is = vec![1, 2, 3];
    for i in is {
        print(i);
    }
}

No matter whether print expects an i64 or an u32 or anything else, this code will compile (note that the syntax for monomorphic literals in Rust is 1u32). See it here. Similar code works just as well in Haskell.

@rjolly
Contributor

rjolly commented Aug 10, 2019

Regarding equality comparison, we have to look at how it plays with multiversal equality. Regarding non-commuting operators, something that would be really useful is if we could define a right-associative operator with the colon notation, def +:(that: BigInt), and use it without the colon, as in 1 + x instead of 1 +: x.
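
For reference, this is how the colon form behaves today; a minimal sketch with a hypothetical `Num` wrapper. The colon-free spelling `1 + x` suggested above is not supported and is not shown.

```scala
// Hypothetical wrapper type used only for this example.
final case class Num(value: BigInt):
  // Operators ending in ':' bind to the right operand: `a +: n` means `n.+:(a)`.
  def +:(that: BigInt): Num = Num(that + value)

val x: Num = Num(BigInt(41))
val y: Num = 1 +: x // the literal is typed against the BigInt parameter
```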

@propensive
Contributor

> Note that both Java and Scala already do this to some degree: An Int literal is treated as a Byte, Short, or Char if the expected type is one of these and the literal fits the range. This is not an implicit conversion (there is none between Int and Byte or the other types), but a specialized treatment of literals. In a sense the proposal is to take that idea and apply it to user defined number types.

As a suggestion for educational materials, instead of using the phrase "Int literal", could we call it an "integral literal" to indicate that it's just as much an Int literal as a Byte literal and a Short literal and that, in the absence of any other expected type, the compiler will default to typing it as an Int?

@denisrosset

@odersky What is the merge timeline for this? I'd love to try this on a "minispire" library and provide feedback, but I'm quite overwhelmed by academic collaborations at the moment.

@odersky
Contributor Author

odersky commented Aug 16, 2019

@denisrosset I'd like to merge this by the end of the month, at the latest. But even after merge it still has to be discussed in the SIP process.

In an application with named and default arguments, where arguments
have side effects and are given out of order, and some arguments are
missing, the previous algorithm worked only if typed and untyped argument
trees were the same.

Test i2916 started failing once literals were represented as Number trees,
since then untyped and typed versions of the argument were different.
But avoid inlining:

 - if typechecking the body to inline generated errors
 - if checking the inlined method generated errors
 - if we are in an inline typer and the same inline typer
   already generated errors.

As part of this commit, merge hasBodyToInline and bodyToInline.
An erroneous body can suppress inlining by returning an EmptyTree.
### Example:

```
val x: BigInt = 111111100000022222222222
```

### Also allow generic literals in patterns

Wrap generic literals in patterns in blocks, which force
expression evaluation. This allows matching (say) a scrutinee
of BigInt type against large numeric literals. But it does
not work if the scrutinee is of type `Any` because then
the expected type for the pattern is missing.

This would be addressed by a feature that's still missing: patterns
of the form `<literal> : <type>`. We have to change the syntax of patterns
in general for this one.
@odersky odersky force-pushed the add-generic-literals branch from f45af02 to a276f0e Compare August 30, 2019 09:16
@odersky odersky force-pushed the add-generic-literals branch from 8af7117 to 93291a4 Compare August 30, 2019 10:10
as suggested in scala#7134 by @vn971
@odersky odersky merged commit 5c952bd into scala:master Aug 30, 2019
@odersky odersky deleted the add-generic-literals branch August 30, 2019 10:57
@anatoliykmetyuk anatoliykmetyuk added this to the 0.18 Tech Preview milestone Aug 30, 2019
@liufengyun liufengyun mentioned this pull request Aug 14, 2020
7 tasks
liufengyun added a commit to dotty-staging/dotty that referenced this pull request Jan 7, 2021
In Scala 2, a typed pattern `p: T` requires `p` to be a
pattern variable.

In Dotty, scala#6919 allows `p` to be any pattern, in order to support
 pattern matching on generic number literals.

This PR aligns the syntax with Scala 2 by stipulating that in a typed
pattern `p: T`, either

- `p` is a pattern variable, or
- `p` is a number literal
dwijnand pushed a commit to dwijnand/scala3 that referenced this pull request Oct 7, 2022
little-inferno pushed a commit to little-inferno/dotty that referenced this pull request Jan 25, 2023
9 participants