-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement string interpolation as a call to String.interpolation #8400
Implement string interpolation as a call to String.interpolation #8400
Conversation
😻 now I need to update my next presentation that mention interpolation internals. just on time. Any comparison whether it was better the inline approach for creating the string builder? Because after this PR there will many |
# In this case the implementation just returns the same string value. | ||
# | ||
# NOTE: there should never be a need to call this method instead of using string interpolation. | ||
def self.interpolation(value : String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this overload? It's equivalent to the non-restricted overload. String#to_s
returns self
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right
I like the syntax of |
Technically, making |
@bcardiff Good question! I didn't measure how it affects compilation time or memory. On one hand the old approach generated more code to type (multiple calls). The new code just generates a call, which means that if in your program you have string interpolation with same types the compiler will have those instances cached. But then it will need to create a different tuple type for each combination. But compiling the compiler took the same time/memory, same for specs, so maybe it doesn't matter. I was going to mention all of this in this PR but I didn't because I didn't conclude anything about it. |
Good point. But what's a case where you want to have a String with the same contents but different memory address? |
When you do unsafe stuff with it, such as passing it to a C library that modifies the string in place. |
Let's try this change and see if it breaks something. Using |
Or maybe we can just duplicate the string. I guess nobody would do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what about moving this to another file string/interpolation.cr
?
While it's a breaking change, wouldn't it be better to still have this optimization in the long run? |
Docs says that |
Am I the only one who finds this implementation of Lines 4276 to 4278 in 0e2e1d0
|
Strings are immutable so there's no point in dup allocating a new string. The more memory we can save, the better. I also agree with reverting my last commit. I'm pretty sure it's harmless. |
This is great. I'm really looking forward to the constant inlining as well. But should this maybe be called |
Yes, the name can be
|
We could consider |
concat was the first name I was going to use, so 👍 from me on that |
EDIT: BTW, how does the general case compare to Tuple#join ? Would it make sense to do similar allocation size optimizations in Enumerable#join? |
@yxhuvud We can't optimize |
I think we could optimize Tuple#join and then make this PR use that, it's slightly more complex because I have to do the size and type checks all inside a single method. Plus I liked the idea of having a specific method representing the implementation of string interpolation. |
It's not a breaking change since you need to use unsafe methods to observe it. I like the idea of this PR, but compile-time and other benchmarks would be great. |
Actually, In case you were using |
I think nobody is doing So I wouldn't worry about this. If it's a breaking change then it's fine, we just document what's the expected behavior and provide an upgrade path (Changelog: notice that |
Totally agree. It's fine to change it. But that's a breaking change. |
It might be a case of some polymorphic function that ends up doing that. Either way, we are fine with the new behaviour: returning the same object. Strings are immutable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although we still need #8400 (comment)
Co-Authored-By: Benoit de Chezelles <[email protected]>
... and I'd like to have at least some benchmarks on the performance impact on compile time and runtime. |
I tried this code: world = "hello"
baz = 1
{% for i in 1..100_000 %}
"hello #{world} foo bar #{baz}"
{% end %} Current compiler:
This PR:
That's probably because Another test that could be done is generating many string interpolations with different types and number of arguments. But note that the above program is 100_000 different string interpolations and it doesn't take that much time, so I think this change is safe for performance (and could actually be a performance improvement). |
With that I also mean that it's very unlikely that a program will have that many different string interpolations, so this benchmark is like the worse case ever. |
Also note that there are 4069 different interpolations across this entire repository, so I think we are still safe with this change :-) |
I pushed another commit where Let's do that, mark it as a breaking change, and see if it affects anyone. If so, we can revert it to return a copy. |
d2c9aac
to
28ac4fd
Compare
I did some timings of compiling the spec suite and compiler in a spreadsheet. |
Interesting. So it's slower? Let me try it too |
I just did that experiment too. Compiling the compiler is slightly faster (around 2 seconds). The |
@asterite did you use a release mode compiler when testing compile times? Did you clean the compiler cache? My results were 17 sigma for 7% slower on compiling the compiler. But it's really interesting that it seems the slowdown and speedup is heavily based on the code being compiled... I don't want this reverted, though. I just find it interesting. I'd really like to automate speed-testing over time but no time :) |
yes |
Looks nice. test file:
:) |
Before this PR, a string interpolation like:
"hello #{world}!"
was roughly expanded to:
In this PR it's expanded to:
The benefits of doing this are:
String.interpolation
"#{value}"
, we return the samevalue
. This is uncommon, but why not?to_s
on it: much more efficient than usingString.build
String.build
approach can't do+
(uncommon, but why not?)"foo#{bar}"
you should do"foo" + bar
". String interpolation will provide the most performant implementation, always (and I plan to further optimize this by inlining constants if possible, but in a separate PR).A small benchmark:
Results: