string interpolation, juxtaposition & multiline strings #2301
Comments
There was a point, I think, where string concatenation created |
@JeffBezanson hates rope strings ;-). Currently, yes, I believe RopeStrings are only created explicitly. The string stuff is going to need an overhaul at some point for greater performance, so I wouldn't depend on the precise representation of strings in general. Fortunately, you should be able to write code to generic strings, no? |
That's fine, just wanted to clarify the intended behavior. The only time I'd care is when building up large strings dynamically in many parts. In such situations, it is fine to create a rope explicitly. |
This is a good proposal. All I'd say is we should also get rid of RopeStrings are the wrong default, since if you use them to concatenate small strings performance is terrible. In the worst case where each RopeString node has 1 character, it uses 30x more memory. |
What? Who is this? "This was a cute experiment, but it's bad."? Like the @RealStefanKarpinski would ever say something like that, please. "You'll pry * for string cat out of my cold, dead, algebraically correct hands", that's what @RealStefanKarpinski would say. |
I think he's going to have to create that github account now :) |
...I was liking |
That's a fair point, @pao. But repetition isn't very common. There are a number of bad problems with |
Yeah, I've seen some of the problems with extension to |
So one of the major issues is The |
We might want to go the rest of the way and make
But if we stopped using |
@JeffBezanson: I'm fine with deleting |
Insisting that |
Why? |
True. My point is more that Having |
If Wiki: http://en.wikipedia.org/wiki/Concatenation
|
http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28strings%29#Concatenation |
I've never seen anyone use |
I'm just saying that |
That's not true at all – Chars don't have an encoding, they are code points, which are completely encoding-independent – that's the whole point. Unicode code points are intrinsically integers. |
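The distinction between a code point and its encoding can be illustrated with a minimal sketch. It assumes current Julia (1.x), where `codepoint` and `codeunits` are available in Base; these names postdate this thread, and the variable names are only illustrative.

```julia
# Assumption: Julia 1.x. In the Julia of this thread, Char was still an Integer subtype.
c = '\u00e9'                        # LATIN SMALL LETTER E WITH ACUTE

cp = codepoint(c)                   # the code point as a UInt32, independent of any encoding
utf8_bytes = codeunits(string(c))   # the bytes of one particular encoding (UTF-8)
same_char = Char(cp)                # round trip from the integer back to the character
```

The code point is a single integer, while the UTF-8 encoding of the same character takes two bytes.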
+1 to string interpolation in the parser. Also, it seems nice and consistent that when juxtaposition is meaningful, it coincides with multiplication. |
I dislike the a "" b concatenation approach and would be equally happy with * or +. What I don't understand is the expectation that you should be able to concatenate chars in the same way as strings. 'a''b' should either be integer multiplication or undefined. Require an explicit string('a','b') if you need to concatenate chars or turn them into strings separately first. Likewise 2'a' should either be integer multiplication or undefined. 2*3*"x" should be "6x" in accordance with operators.jl line 44. |
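As a small sketch of the explicit route mentioned above, assuming current Julia (1.x): `string` accepts any mix of characters and strings and concatenates their printed forms. The variable names are illustrative.

```julia
# Explicit conversion and concatenation via `string`; no operator on Char is needed.
ab = string('a', 'b')            # two Chars become one String
mixed = string('a', "bc", 'd')   # Chars and Strings can be mixed freely
```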
Unicode code points serve to enumerate characters. The ordering of Unicode characters is a convention, and in the least significant bits doesn't necessarily mean anything. However, it is also true that the values of code points have some meaning, particularly in the most significant bits (separating code planes). So We may want access to both interpretations, but which should be the default? And should it have anything at all to do with how we deal with |
Regarding I also don't think it's necessary to handle every combination of (Of course, an alternative is just to break down and use |
( I like the python way of |
For now I'm retracting the deprecation of |
Ok, this should all be finished, except for string juxtaposition. I had to create separate In order to remove ss = sort(b"$s") |
I'm not even sure what |
I once more attempted the exercise of making Char not an Integer. What happens is you need to duplicate a lot of the scalar definitions in |
I will also add that I think |
We should probably have a bikeshedding session about the non-standard string literal prefixes in Base. There are too many of them now and they're too hard to remember and the behavior of |
I agree that making Char not an Integer feels really contrived and awkward. I would be cool with discarding |
Just make sure we don't lose a replacement for |
Although I still haven't figured out why |
Yes, repeating strings is definitely necessary. I think some kind of |
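For reference, a sketch of how repetition is spelled in current Julia (1.x). This is not necessarily what was being proposed in this comment, whose wording is cut off above.

```julia
# Assumption: Julia 1.x, where both a function and an operator form exist.
by_function = repeat("ab", 3)   # function form
by_operator = "ab"^3            # operator form, consistent with * as concatenation
```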
@pao: the issue is whether |
Shall we get rid of |
Yes I'd say so. |
I still don't understand why the string concatenation operator must also work for Chars. Why is it so bad to require that you first make strings out of your Chars if you want to concatenate them? |
Not a problem any more. |
That's the assertion I'm challenging (as @GunnarFarneback notes). |
Well now |
String juxtaposition with ... of course, I said the same thing when I saw I do agree with Jeff that 3 syntaxes is silly. I propose that, having heard everyone's views, Stefan and/or Jeff just make an executive decision and let everyone just get used to things. (Please just make a good choice. ;-) ) |
Now that we have call overloading, it occurred to me that we can do this:
    julia> Base.call(s::String, args...) = join(args, s)
    call (generic function with 855 methods)
    julia> a, b, c = "foo", "bar", "baz"
    ("foo","bar","baz")
    julia> ", "(a, b, c)
    "foo, bar, baz"
I'm not saying we should necessarily do this, but we could. The reason I was thinking this is that if we used juxtaposition for string concatenation, then you would write
    julia> words = [a, b, c]
    3-element Array{ASCIIString,1}:
    "foo"
    "bar"
    "baz"
    julia> reduce("", words)
    "foobarbaz"
Slightly weird but it does have a certain internal consistency. |
I was surprised that string literal juxtaposition doesn't work. I was trying to copy-paste some C++ code that had
    std::string points =
        "248258.441322 7417253.63825 44.2832223546\n"
        "248258.909841 7417253.42727 44.066906061\n"
        "248258.985642 7417253.11483 44.5358143357\n"
        ...
        "248267.816489 7417238.83666 44.6165076596\n";
and thought that might work in Julia (with some bracketing). Should we support this? It's the same as multiplication juxtaposition, no? Unlike numbers, it "looks" more like the true result than |
@andyferris I also think it's weird that you can't concatenate string literals at the parser level. If you do
    points = "248258.441322 7417253.63825 44.2832223546\n" *
             "248258.909841 7417253.42727 44.066906061\n" *
             "248258.985642 7417253.11483 44.5358143357\n" * #...
             "248267.816489 7417238.83666 44.6165076596\n"
you are actually creating machine instructions for each |
Here's what I think we should do:
interp_parse to get close, but they won't be able to handle string literals in interpolated expressions [Quotes and string interpolation #455]. That's fine. You can either have full-blown parser-supported string interpolation or you can have custom string literals, but not both at the same time. This is an acceptable compromise.
Allow string literal juxtaposition for concatenation. Merge this branch and make string literal juxtaposition a syntax for string concatenation. If you want to concatenate two variables as strings, you can do foo "" bar, which parses as string(foo,bar) rather than string(foo,"",bar).
If you need to reference the string concatenation operation as a function, you use string. This is not really any different than having special syntax for e.g. indexing.
Deprecate * for string concatenation. This was a cute experiment, but it's bad. The biggest problem is the Char*Char issue, which we should not deprecate, but rather just remove (and perhaps make Char a proper integer type again?).
@mstr?) for further processing. When a prefixed multiline string literal is encountered, no interpolation is done, but the string is passed off to an appropriate macro.
Regarding the last point, consider the following:
Should be equivalent to:
The @mstr macro can handle indentation stripping a la #70 by looking at the trailing whitespace of the last string literal macro argument and stripping that from the indentation of all the string literal arguments.
Prefixed triple-quoted strings should emit macro calls just like prefixed single-quoted strings do, although I'm not sure whether they should call a different macro or the same one. Having to define different macros to support both normal and triple strings is annoying, but on the other hand, you might want to handle triple-quoted literals specially.
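The indentation-stripping rule described above can be sketched as an ordinary function, assuming current Julia (1.x). `strip_indent` is a hypothetical helper, not the `@mstr` macro itself: it works on one already-assembled string rather than on a list of literal macro arguments, and it takes the whitespace on the final line (the one before the closing delimiter) as the indentation to remove.

```julia
# Hypothetical helper illustrating the "use the last line's whitespace" rule.
function strip_indent(s::AbstractString)
    lines = split(s, '\n')
    indent = lines[end]                        # whitespace preceding the closing delimiter
    all(isspace, indent) || return String(s)   # last line has content: leave the text alone
    n = ncodeunits(indent)
    # Drop the indent from every line that starts with it; keep other lines unchanged.
    stripped = [startswith(l, indent) ? l[n+1:end] : l for l in lines]
    return join(stripped, '\n')
end

example = "\n    foo\n      bar\n    "
result = strip_indent(example)
```

Relative indentation is preserved: the line indented by six spaces keeps two of them after the common four are removed.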