-
-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow easy definition of sanitized substitutions #264
Comments
I feel I should link to Issue #251 as that issue also deals with dimension formatters. I say we need to think careful about improving dimension formatters in general. |
They're different formatters though, one is for the values the other is for the name. Apart from making sure the naming makes it clear which is which there isn't much of a link between the two. |
The link is that we want a consistent system for defining multiple types of formatters. In particular, it should be easy to define these formatters, we want a consistent way of defining such formatters, and finally we need to worry about pickling issues (e.g if a lamda function is used). For this reason, I think issue #251 is related. There is also the issue of formatting units to consider. |
Can we come to a decision here? Currently we have two formatter parameters on Dimensions:
Adding one additional formatter for the name itself along with some easy way to define this as an alias seems like the straightforward and sensible option here. By default this formatter would probably just be I also don't think we need to concern ourselves with dimension formatting any further for now. I know @jlstevens at least is concerned about proliferation of parameters on Dimensions but I think at least |
Sounds fine to me. Renaming format_string and/or formatter does seem appropriate, since there is no clue in their names how they are different from each other. label_formatter seems fine. |
I agree this needs to be addressed for this release (i.e now). The old names were terrible so I am looking forward to something more meaningful! Here is what I understand of the proposal above:
Ok, I am happy with this! I dislike new parameters but I dislike parameters with meaningless names even more (my main gripe from before). The new names are much clearer, and moving |
Sounds like a clean proposal. I guess that adding a lookup table containing all the Greek letters (upper and lower case) and translating anything found in the table to their LaTeX equivalents is too tricksy? That wouldn't cover other common cases like letters with super or subscripts, of course. As it is, the label_formatter will have to be set for each and every dimension that's a Greek letter, which seems painful (though better than before). I'm not actually arguing for the pre-populated global lookup table, just making sure you've considered it already and rejected it as making too many assumptions. |
@jbednar Your suggestion here is along the lines something I've just been discussing with Philipp. I still like most of the proposal above but maybe My objection is that this avoids sanitization of dimension names, but doesn't let the user define aliases for groups and labels. Here is a new suggestion to consider: hv.substitutions(dimension={'\$phi$':'phi', '$\alpha$':'alpha'},
group={'controls':'Experimental Controls',
'group1':'My Group with $pec!al characters'},
label = {'Red':'Red Channel'}) The idea is that sanitization is applied if the appropriate substitution is not defined. Note that there are 'namespaces' here (unlike the current sanitization substitution system) and that it should be easy to use by making it available at the Again, this is just an idea to consider. As always, it is best to come up with a general solution if possible! Edit: Probably no need to make it a function when a simple dictionary could work. Something like this could suffice: hv.substitutions.dimension= dict(phi='\$phi$', alpha='$\alpha$')
hv.substitutions.group=dict(controls='Experimental Controls',
group1='My Group with $pec!al characters')
hv.substitutions.label = dict(Red='Red Channel') |
I'd be okay with this but it's another system for the user to know about, but I do think it might be a good idea. There's is an inconsistency in your example though, which highlights an issue with the proposal. The group/label mappings go form short-form to long-form, while the Dimension does the opposite. So I'm somewhat unclear in which direction you think the substitutions should go. Should the user always define groups and labels that are already valid identifiers and the substitutions define the long-form used for titles (in which case group1 would be rejected)? Similarly does the user define the long-form or short-form as the Dimension name? |
I think the example I've added in the edit makes the most sense to me. Here it is again: hv.substitutions.dimension= dict(phi='\$phi$', alpha='$\alpha$')
hv.substitutions.group=dict(controls='Experimental Controls',
group1='My Group with $pec!al characters')
hv.substitutions.label = dict(Red='Red Channel') This way round helps make sure the sanitized input is valid (python does that for you). Although it is a new system for users to know about, users would have to learn about We already have two instances of sanitize_identifier = sanitize_identifier_fn.instance()
dimension_sanitizer = sanitize_identifier_fn.instance(capitalize=False) The new suggestion would hook into these instances to define the substitutions. Not sure if we want to keep group/label substitutions together or introduce My other concern is about making this substitution list effectively global. That comes with pros and cons too... |
A third alternative (that I might like the most) is to extend the same short tuple proposal for If you wanted: hv.Curve(label=('label1','My Long Label'), group=('grp', 'Money in $'),
kdims=[hv.Dimension(('dollars', '$'))]) Bit awkward but you can use tuples as 'objects' for these aliases: label1 = ('label1','My Long Label')
money = ('money', 'Money in % profits')
dollars = ('dollars', '$') Then the definition would be: hv.Curve(label=label1, group=money, kdims=[hv.Dimension(dollars)]) I think this might be a simpler and more consistent approach. What do you think? |
My example above bothers me as I wouldn't want to clutter the namespace with random names. Just to tidy things up (the fundamental suggestion of working with tuples is unchanged) you can imagine a simple labels = hv.aliases(my_label='Some Label', my_other_label='Some Other Label')
groups = hv.aliases(money = '$$')
hv.Curve(label=labels.my_label, group=groups.money) This would give you 1. tab completion (convenient!) 2. a cleaner namespace 3. and would be simply a shortcut for supplying the tuples (and therefore entirely optional). I think I really like this proposal. What are your thoughts on this? |
Seems good to me. |
As I've dealt with the other issues assigned to me for the 1.4 release, I'll tackle this now. Philipp is currently working on an important PR and has a couple of issues outstanding (most of which I think will be easy fixes). |
As the PR is now merged, I can finally close this issue! |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
For the next release, we should make sure it is easy for the user to define convenient substitutions for the sanitized strings, particularly dimension names (which can get very unwieldy, especially when LateX is involved)
One suggestion is to allow definition of an alias at the dimension level e.g by specifying the name as a tuple. Alternatively, we should extend the formatter system to handle custom string sanitation and the idea of a tuple for a name could just be a shortcut to defining the appropriate formatter.
The text was updated successfully, but these errors were encountered: