PEP: 515
Title: Underscores in Numeric Literals
Version: $Revision$
Last-Modified: $Date$
Author: Georg Brandl, Serhiy Storchaka
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2016
Python-Version: 3.6
Post-History: 10-Feb-2016, 11-Feb-2016

Abstract and Rationale
======================

This PEP proposes to extend Python's syntax and number-from-string
constructors so that underscores can be used as visual separators for
digit grouping purposes in integral, floating-point and complex number
literals.

This is a common feature of other modern languages, and can aid
readability of long literals, or literals whose value should clearly
separate into parts, such as bytes or words in hexadecimal notation.

Examples::

    # grouping decimal numbers by thousands
    amount = 10_000_000.0

    # grouping hexadecimal addresses by words
    addr = 0xCAFE_F00D

    # grouping bits into nibbles in a binary literal
    flags = 0b_0011_1111_0100_1110

    # same, for string conversions
    flags = int('0b_1111_0000', 2)


Specification
=============

The current proposal is to allow one underscore between digits, and
after base specifiers in numeric literals.  The underscores have no
semantic meaning, and literals are parsed as if the underscores were
absent.

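As a small illustration of the rule that underscores carry no semantic
meaning, the following sketch compares literals written with and
without separators.  The variable names are chosen for this example
only, and the code assumes an interpreter that implements this PEP
(Python 3.6 or later).

```python
# Each pair below denotes the same value; the underscores are purely visual.
plain_decimal = 10000000
grouped_decimal = 10_000_000

plain_hex = 0xCAFEF00D
grouped_hex = 0x_CAFE_F00D      # one underscore may follow the base specifier

plain_float = 1000.0001
grouped_float = 1_000.000_1     # underscores are allowed in the fraction too

same_values = (
    plain_decimal == grouped_decimal
    and plain_hex == grouped_hex
    and plain_float == grouped_float
)
print(same_values)
```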

Literal Grammar
---------------

The production list for integer literals would therefore look like
this::

   integer: decinteger | bininteger | octinteger | hexinteger
   decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
   bininteger: "0" ("b" | "B") (["_"] bindigit)+
   octinteger: "0" ("o" | "O") (["_"] octdigit)+
   hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
   nonzerodigit: "1"..."9"
   digit: "0"..."9"
   bindigit: "0" | "1"
   octdigit: "0"..."7"
   hexdigit: digit | "a"..."f" | "A"..."F"

For floating-point and complex literals::

   floatnumber: pointfloat | exponentfloat
   pointfloat: [digitpart] fraction | digitpart "."
   exponentfloat: (digitpart | pointfloat) exponent
   digitpart: digit (["_"] digit)*
   fraction: "." digitpart
   exponent: ("e" | "E") ["+" | "-"] digitpart
   imagnumber: (floatnumber | digitpart) ("j" | "J")

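One way to explore the grammar above is to ask the compiler whether a
given spelling is accepted.  The sketch below is only an illustration:
``is_valid_literal`` is a hypothetical helper name, and it merely
checks that the text compiles as an expression on an interpreter that
implements this PEP, so it should only be fed candidate numeric
literals.

```python
def is_valid_literal(source):
    """Return True if ``source`` compiles as a Python expression."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# Single underscores between digits, or directly after a base specifier.
accepted = ["1_000", "0_0", "0x_FF", "0b_0011_1111", "1_0.0_1e1_0", "1_0j"]

# Doubled or trailing underscores, an underscore next to the decimal
# point or exponent marker, and a nonzero digit after a leading "0".
rejected = ["1__000", "1000_", "0_7", "1_.0", "0x__FF", "1e_5"]

for text in accepted + rejected:
    print(text, is_valid_literal(text))
```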

Constructors
------------

Following the same rules for placement, underscores will be allowed in
the following constructors:

- ``int()`` (with any base)
- ``float()``
- ``complex()``
- ``Decimal()``

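The constructors listed above can be exercised as in the following
sketch.  The input strings and the helper ``parses_as_int`` are
invented for this example, and the code assumes Python 3.6 or later.

```python
from decimal import Decimal

million = int("1_000_000")
flags = int("0b_1111_0000", 2)         # explicit base, prefix still allowed
address = int("0x_CAFE_F00D", 0)       # base 0 infers the base from the prefix
ratio = float("1_000.000_1")
point = complex("1_0+2_0j")
price = Decimal("1_000.50")


def parses_as_int(text):
    """Return True if ``int(text)`` succeeds, False on ValueError."""
    try:
        int(text)
    except ValueError:
        return False
    return True


print(million, flags, address, ratio, point, price)

# Placement follows the literal rules: no doubled, leading or trailing
# underscores.
for text in ["1__000", "_1000", "1000_"]:
    print(text, parses_as_int(text))
```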

Further changes
---------------

The new-style number-to-string formatting language will be extended to
allow ``_`` as a thousands separator, where currently only ``,`` is
supported.  This can be used to easily generate code with more
readable literals. [11]_

The syntax would be the same as for the comma, e.g. ``{:10_}`` for a
width of 10 with ``_`` separator.

For the ``b``, ``x`` and ``o`` format specifiers, ``_`` will be
allowed and group by 4 digits.

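A brief sketch of the formatting extension follows, assuming Python
3.6 or later.  The values and variable names are illustrative only.

```python
value = 10_000_000

decimal_text = format(value, "_")          # groups of three digits
padded_text = "{:10_}".format(1234)        # width 10, right-aligned
hex_text = format(0xCAFEF00D, "_x")        # groups of four hex digits
binary_text = format(0b11110000, "_b")     # groups of four bits

# Generated source code stays readable and still evaluates to the
# original value.
generated_line = "amount = {:_}".format(value)
round_trip = int(decimal_text)

print(decimal_text)
print(repr(padded_text))
print(hex_text)
print(binary_text)
print(generated_line)
```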

Prior Art
=========

Those languages that do allow underscore grouping implement a large
variety of rules for allowed placement of underscores.  In cases where
the language spec contradicts the actual behavior, the actual behavior
is listed.  ("single" or "multiple" refer to allowing runs of
consecutive underscores.)

* Ada: single, only between digits [8]_
* C# (open proposal for 7.0): multiple, only between digits [6]_
* C++14: single, between digits (different separator chosen) [1]_
* D: multiple, anywhere, including trailing [2]_
* Java: multiple, only between digits [7]_
* Julia: single, only between digits (but not in float exponent parts)
  [9]_
* Perl 5: multiple, basically anywhere, although docs say it's
  restricted to one underscore between digits [3]_
* Ruby: single, only between digits (although docs say "anywhere")
  [10]_
* Rust: multiple, anywhere, except for between exponent "e" and digits
  [4]_
* Swift: multiple, between digits and trailing (although textual
  description says only "between digits") [5]_


Alternative Syntax
==================

Underscore Placement Rules
--------------------------

Instead of the relatively strict rule specified above, the use of
underscores could be less limited.  As seen in other languages, common
rules include:

* Only one consecutive underscore allowed, and only between digits.
* Multiple consecutive underscores allowed, but only between digits.
* Multiple consecutive underscores allowed, in most positions except
  for the start of the literal, or special positions like after a
  decimal point.

The syntax in this PEP has ultimately been selected because it covers
the common use cases, and does not allow for syntax that would have to
be discouraged in style guides anyway.

A less common rule would be to allow underscores only every N digits
(where N could be 3 for decimal literals, or 4 for hexadecimal ones).
This is unnecessarily restrictive, especially considering the
separator placement is different in different cultures.

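To make the point about group sizes concrete, the sketch below writes
the same values with different groupings, all of which the selected
syntax accepts.  The Indian-style grouping is an illustration chosen
for this example and is not taken from the discussion of this PEP.

```python
# The same decimal value, grouped by thousands and in the Indian
# style (three digits, then groups of two).
thousands_grouping = 10_000_000
indian_grouping = 1_00_00_000

# The same hexadecimal value, grouped by 16-bit words and by bytes.
word_grouping = 0xCAFE_F00D
byte_grouping = 0xCA_FE_F0_0D

print(thousands_grouping == indian_grouping)
print(word_grouping == byte_grouping)
```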

Different Separators
--------------------

A proposed alternate syntax was to use whitespace for grouping.
Although strings are a precedent for combining adjoining literals, the
behavior can lead to unexpected effects which are not possible with
underscores.  Also, no other language is known to use this rule,
except for languages that generally disregard any whitespace.

C++14 introduces apostrophes for grouping (because underscores
introduce ambiguity with user-defined literals), which is not
considered because of the use in Python's string literals. [1]_


Implementation
==============

A preliminary patch that implements the specification given above has
been posted to the issue tracker. [12]_


References
==========

.. [1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3499.html

.. [2] https://dlang.org/spec/lex.html#integerliteral

.. [3] https://perldoc.perl.org/perldata#Scalar-value-constructors

.. [4] https://web.archive.org/web/20160304121349/http://doc.rust-lang.org/reference.html#integer-literals

.. [5] https://docs.swift.org/swift-book/ReferenceManual/LexicalStructure.html

.. [6] https://github.com/dotnet/roslyn/issues/216

.. [7] https://docs.oracle.com/javase/7/docs/technotes/guides/language/underscores-literals.html

.. [8] http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html#2.4

.. [9] https://web.archive.org/web/20160223175334/http://docs.julialang.org/en/release-0.4/manual/integers-and-floating-point-numbers/

.. [10] https://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers

.. [11] https://mail.python.org/pipermail/python-dev/2016-February/143283.html

.. [12] http://bugs.python.org/issue26331


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: