Tags: an attempted overview #2826
Comments
It wouldn't be too hard to change I think that moving to a namespace for most items would be a good thing. We can create a core: or jupyter: namespace to which eventually |
The most recent suggestion was that the namespace for core jupyter tags would in theory be the null string with no colons. The goal is not to make other package namespaces inaccessible (that would be basically impossible), but rather to make their use as explicit as possible.

I know that we could modify the code to respect the convention that was proposed, but it's dangerous to have magic values cover so much area in the space of potential strings. If we're going to allow both of those in, I'd much prefer that we restrict it to the exact two strings that need to be covered to maintain the current use-case.

More generally, though, my point was that we need to come to a consensus on this now, before 5.1 is released. Once 5.1 is released, we have to live with that decision for backwards compatibility reasons. As you can see it's already really complicated; I fear it will only get more complicated if we aren't pretty strict about enforcing this as soon as possible.

And regarding duplication… as long as it's a thing we need to do once and never again, I think we can probably avoid this problem by switching to conventions and making some way to overwrite the code that you just wrote. The hard part is that nbval would then need to also be a notebook_extension… which those devs might not want to do.

It's worth noting that for switching conventions, it wouldn't be hard to write a preprocessor in nbconvert that would automatically convert metadata from one convention to another (something like this in preprocess_cell):

import re
old_conv = re.compile(r"nbval-(.*)")
new_conv = "nbval:"
# group(1) is the text after "nbval-"; group(0) would keep the old prefix
cell.metadata.tags = [new_conv + old_conv.match(tag).group(1) if old_conv.match(tag) else tag for tag in cell.metadata.tags]

…which you could apply once to all the notebooks in your project and be done with it. Similarly you could write a custom save_hook that would apply a similar sort of transform so that you wouldn't need to actually type redundant tags even if they needed to be stored redundantly in order to handle separation of concerns appropriately.

cc @gnestor sorry I didn't pull you in before. |
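The one-off conversion described in this comment can be tried outside nbconvert first. The following is a minimal sketch, assuming a notebook that has already been loaded as a plain dict in the nbformat v4 layout; the helper names `convert_tag` and `convert_notebook_tags` are hypothetical and are not part of nbconvert or nbval. Note that it takes `group(1)`, the text after the old prefix, so the old prefix is not carried over into the new tag.

```python
import re

# Old convention: "nbval-<name>"; new convention: "nbval:<name>".
OLD_CONV = re.compile(r"nbval-(.*)")
NEW_PREFIX = "nbval:"


def convert_tag(tag):
    """Return the tag rewritten to the new convention, or unchanged."""
    match = OLD_CONV.fullmatch(tag)
    # group(1) is the part after "nbval-"; group(0) would repeat the prefix.
    return NEW_PREFIX + match.group(1) if match else tag


def convert_notebook_tags(notebook):
    """Rewrite cell tags in place for a notebook given as a plain dict."""
    for cell in notebook.get("cells", []):
        metadata = cell.get("metadata", {})
        if "tags" in metadata:
            metadata["tags"] = [convert_tag(tag) for tag in metadata["tags"]]
    return notebook


nb = {"cells": [{"cell_type": "code", "metadata": {"tags": ["nbval-skip", "keep"]}}]}
convert_notebook_tags(nb)
print(nb["cells"][0]["metadata"]["tags"])  # ['nbval:skip', 'keep']
```

Inside nbconvert, the same `convert_tag` function could presumably be called from the `preprocess_cell` method of a `Preprocessor` subclass, which is where the comment suggests putting it.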
I think you've misunderstood what the code is doing: it's testing for membership in a list, not doing a substring match:

// In this list
a = ['raises-exception']
a.indexOf('raises-exception') // -> 0
// Not in this list
b = ['foo:raises-exception']
b.indexOf('raises-exception') // -> -1

More generally, can we please not wrap tags up in specifications and namespaces and overthinking? Yes, we should try to avoid clashes, but a simple prefix like we use for |
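For readers more at home in Python, the same distinction between list membership and substring matching looks like this. It is only an illustration of the two tests, not the notebook's actual code (that is the JavaScript `indexOf` call quoted in this comment); both function names are hypothetical.

```python
def has_tag(tags, wanted):
    """Exact membership: true only if some tag equals `wanted`."""
    return wanted in tags


def any_tag_contains(tags, fragment):
    """Substring match: true if `fragment` occurs inside any tag."""
    return any(fragment in tag for tag in tags)


namespaced = ["foo:raises-exception"]
print(has_tag(namespaced, "raises-exception"))           # False
print(any_tag_contains(namespaced, "raises-exception"))  # True
```

Only the second test would let a prefixed tag such as `foo:raises-exception` trigger the behaviour, which is the case the issue text was worried about.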
Sorry I was thrown off… I thought the feature was specifically implemented so that nbval-raises-exception would pass as well. I thought it was a weird way to test for that. In that case, I support leaving things as is but also establishing what our rules and expectations around tags are, whether for user space reasons or cyclomatic complexity reasons. |
To further confuse things, the initial version of the PR did support using |
/cc @ewmassey for papermill + tags |
Thank you for the summary @mpacer. As is pointed out, I'm both against frontends using tags as part of their state AND an offender as a papermill contributor / user. To bring clarity to my hypocrisy, I'll explain:
|
JupyterLab currently does not use any tag data. We opened jupyterlab/jupyterlab#2412 as a feature parity issue to implement the |
|
My own understanding of tags is this:
* For third party users only - official jupyter projects shouldn't use them
so nbconvert shouldn't use them.
* Third parties *should* use some sort of namespacing
* Third parties can interpret them in any way they want
* Frontends can only use tags in a completely generic manner (tag based
filtering for example) and should not have particular UI/UX behavior
associated with "special tags".
The counterside of this is:
* Anything tag like that we want to (official jupyter projects) do should
go into cell metadata and be added to the notebook format. This will
prevent us from creating informal pseudo-specifications.
Based on this, I agree with Kyle that `raises-exception` should be added to
the schema and documentation. But I would prefer it be moved to metadata.
|
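Is this correct:
1. tags are easy to add/change/delete for the user
2. tags that do things (other than filtering) are only used by 3rd party software
3. standard jupyter functionality (like hiding a cell, tagging for special CSS) has to be done through another means (metadata)
Wouldn't it be easier to have tags that DO have special functionality in the core software? Or am I confused? |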
My understanding is that 1, 2, 3 are correct.
There are a couple of challenges with using tags for special functionality
in the core software:
* Tags alone are not sufficient as they can only encode very simple
information. If we embrace the usage of tags for official functionality, we
will have to answer the "tag or metadata?" question for each and every bit
of official data. From an organizational and communication perspective, it
is much easier to make the decision a single time.
* Any special functionality of core jupyter software needs to be in the
notebook format and official documentation. It is simpler conceptually to
not have a portion of our spec that gets into particular tags. Also, we
don't want to communicate to others that they can ask us to support
additional official tags.
* It is easier to deal with metadata than tags from a code perspective.
…On Thu, Sep 7, 2017 at 1:21 PM, Doug Blank ***@***.***> wrote:
Is this correct:
1. tags are easy to add/change/delete for the user
2. tags that do things (other than filtering) are only used by 3rd
party software
3. standard jupyter functionality (like hiding a cell, tagging for
special CSS) has to be done through another means (metadata)
Wouldn't it be easier to have tags that DO have special functionality in
the core software? Or am I confused?
--
Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
|
Here is another issue:
But I hope that there is a UI/UX developed that is as easy as tags for doing official functionality. Perhaps that is what is missing from the discussion. |
@dsblank My understanding is that we need modifications to 2. & 3. above:

from: 2. tags that do things (other than filtering) are only used by 3rd party software
to: 2. specific tag values are only used by 3rd party software (caveat

from: 3. standard jupyter functionality (like hiding a cell, tagging for special CSS) has to be done through another means (metadata)
to: 3. standard jupyter functionality (like hiding a cell, tagging for special CSS) can be done through generic metadata or via a generic tag-based interface that requires explicit user-specification

The key to both nuances is that we can create ways for users to access specific functionality (hiding a cell, tagging for special CSS), but doing so requires that the users both specify what tags they want to use and then apply those particular tags to their cells. We don't have a general way of surfacing that kind of functionality (today) in the notebook front end, but that's a matter of implementation.

For example, it should be possible (once I rework #2413) to access tag-based element filtering from the "Download As…" menu item by uploading traitlets values via a json config file that specifies which tags indicate which kind of filtering. I imagine other front-end features could similarly use loading configuration to enable these kinds of generic tag interfaces.

However, that seems like way too much overhead to be the only mechanism for exposing this functionality. My thoughts as to how to do this more simply are still evolving. |
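One way to picture such a generic tag interface: the tool hard-codes no tag values, and the user supplies a mapping from behaviour to tags. The sketch below assumes a JSON config and a notebook held as a plain dict. The config keys `remove_cell_tags` and `remove_all_outputs_tags` borrow the names of the nbconvert `TagRemovePreprocessor` options listed in this issue, but the `apply_tag_config` function and the `hide` / `no-output` tag values are hypothetical, and this is not how nbconvert itself is implemented.

```python
import json

# User-supplied configuration: which tags trigger which behaviour.
# The tag values ("hide", "no-output") are the user's choice, not magic values.
CONFIG_JSON = '{"remove_cell_tags": ["hide"], "remove_all_outputs_tags": ["no-output"]}'


def apply_tag_config(notebook, config):
    """Return a filtered copy of the notebook according to `config`."""
    remove_cell = set(config.get("remove_cell_tags", []))
    remove_outputs = set(config.get("remove_all_outputs_tags", []))
    kept = []
    for cell in notebook["cells"]:
        tags = set(cell.get("metadata", {}).get("tags", []))
        if tags & remove_cell:
            continue  # drop the whole cell
        cell = dict(cell)  # shallow copy so the input notebook is untouched
        if tags & remove_outputs and "outputs" in cell:
            cell["outputs"] = []
        kept.append(cell)
    return {**notebook, "cells": kept}


notebook = {
    "cells": [
        {"cell_type": "code", "source": "secret()", "outputs": [],
         "metadata": {"tags": ["hide"]}},
        {"cell_type": "code", "source": "plot()", "outputs": [{"output_type": "stream"}],
         "metadata": {"tags": ["no-output"]}},
        {"cell_type": "code", "source": "1 + 1", "outputs": [{"output_type": "execute_result"}],
         "metadata": {}},
    ]
}
filtered = apply_tag_config(notebook, json.loads(CONFIG_JSON))
print([cell["source"] for cell in filtered["cells"]])  # ['plot()', '1 + 1']
```

The point of the design is that changing which tags mean "hide" requires editing only the config, never the tool.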
Here's an interpretation for notebook wide metadata, largely targeting the authors, descriptions, and tag fields: This comes from nteract/nteract#1503 |
I think that a separate editor for metadata and tags could work, especially if it were easy to assign the metadata/formatting/tags data for a certain cell as a keyboard shortcut so that it can be applied to another cell with a few keystrokes. |
Just to clarify: Is this blocking us from releasing 5.1 or have we decided that the current implementation of |
To me it's totally acceptable and exactly the kind of thing tags are there for. I don't see any advantage of saying that only third-party projects can define tags. I'm already frustrated by how long it's taken to get 5.1 out the door, so I really hope we don't want to block it on this issue. |
It's not in saying that only third-party projects and users can create tags. People are saying if you want defined expected Jupyter behavior, it better be well specified and likely put in the metadata. |
As for the release being blocked, just ship with |
Tags are in metadata, and I don't see anything wrong with specifying some particular ones, like |
I agree with @rgbkrk that we should ship 5.1 with In general, I agree with @takluyver that tags were created with the specific purpose of providing a single UI for defining labelling and behavior without having to create new, dedicated UI for every flag, which is required for any other metadata structure we come up with. Of course, most metadata won't fit in tags, and some will have logical UI other than tags, but things like |
The distinction between "expected Jupyter behavior" (leading to a format spec change) and "supported by a few tools" (could be implemented by a tag convention) can be confusing when one official reference frontend is by far most popular and implements a specific convention. I think we want to avoid "the implementation is the standard" - but of course it can be difficult to set that expectation about what is implementation and what is the standard. @takluyver - it sounds like you don't expect all frontends to implement the |
I don't consider it an imperative, but where frontends have similar 'execute all' functionality, I wouldn't be surprised if users ask for similar behaviour around the tag. We are defining the tag as a marker that the tagged cell is expected to raise an error, but it's up to tools how and whether to use that.
In principle, yes. But let's not stifle ourselves by trying to write the standards before we allow ourselves to use anything. Let's make tags useful first, and then try to formalise them. Function annotations in Python are an example of what I mean. The syntax has been supported since Python 3.0, but core developers said "we won't define any uses, we'll let third parties come up with them". So they went basically unused until those core devs came back and defined them as type hints. |
It sounds like everyone on the different sides of the issue agree on this release going out either way, so I'm going to close this and I think we can start on a new issue in nbformat. |
Thanks!
|
Jupyter-wide policy regarding tags in the core projects
We really need to make a decision about how core projects handle cell tags before the final release of notebook 5.1.
There has been ongoing discussion amongst @rgbkrk @takluyver @jasongrout @ellisonbg @Carreau and others regarding the purpose of tags… Right now we are not shipping any magic values with core Jupyter projects.
N.B.: by "magic values", I mean values that are hard-coded and that map onto specific behaviour by default, usually with no ability for users to override this behaviour mapping.
Notebook 5.1 includes a magic value (#2549) which violates some of the expectations that had arisen in earlier discussions around allowed magic values.
Pseudo-namespaces with colons
@jasongrout and @ellisonbg had been extremely concerned about stepping on users' namespaces in any privileged jupyter tags. This would allow other projects/users to assign their own "namespace" by prefixing their tags with the namespace value followed by a colon (e.g., namespace:tag).
We had discussed the possibility of saving for jupyter the "null string" namespace (which would then require no :). This would make the raises-exception tag a candidate for meeting that namespace requirement. Unfortunately that is not the only value that matches the current "magic value".
Potential problem: "*raises-exception*" matching
Right now, #2549 makes the namespaces convention impossible. It includes any string that contains the characters raises-exception anywhere inside of it, regardless of the namespace + : that might prefix it.
The value is not customisable, and is specified deep within the codecell execute functionality, meaning it will not be able to be easily overwritten:
notebook/notebook/static/notebook/js/codecell.js
Lines 314 to 325 in d4c6fe4
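The colon convention can be made concrete with a small parser. This is a sketch of the proposal only, assuming that the text before the first colon is the namespace and that a tag without a colon falls in the empty ("null string") namespace reserved for Jupyter; `split_tag` is a hypothetical helper, not an existing Jupyter API.

```python
def split_tag(tag):
    """Split "namespace:name" into (namespace, name).

    A tag without a colon is placed in the empty namespace.
    """
    namespace, sep, name = tag.partition(":")
    if not sep:
        return "", tag
    return namespace, name


for tag in ["raises-exception", "nbval:raises-exception"]:
    print(split_tag(tag))
```

Under this reading, a core tool would act on a tag only when the namespace is empty and the name matches exactly, so a third-party tag with its own prefix could never trigger core behaviour by accident.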
Compromise to resolve issues around #2549
I think the simplest suggestion would be to allow raises-exception and nbval-raises-exception as the two tags that the core notebook "knows" about. This greatly reduces namespace stompage while not breaking the behaviour that @takluyver and @mscuthbert introduced in merging #2549. It's less than ideal to have a non-core project determining magic values inside the core project… but without making it configurable via a traitlet (or something), I can't think of another way to handle this.
Even so it would seem to be a good idea to modularise this chunk of functionality so that it can be overwritten by external libraries without needing to rewrite the entirety of CodeCell.prototype.execute just to access this one feature.
Document how tags are used outside of changelogs.
Now that tags are supported in at least a couple of different ways in the core projects and in 3rd party projects, we should explicitly document those ways of using them. I list the ones I know about at the bottom of this issue.
We need to start this effort and it's closely tied to how we decide to handle the policy in general. Here are a couple of my initial thoughts:
a. name is specified in cell.metadata as an explicit unique id across the notebook
i. we should have an interface similar to cell tags that allows assigning a single unique name to each cell and that enforces the uniqueness condition as part of the UI
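The uniqueness condition is straightforward to check programmatically, whatever UI ends up enforcing it. A minimal sketch, assuming each cell stores its name under `cell.metadata.name` as proposed here and that the notebook is a plain dict; `duplicate_cell_names` is a hypothetical helper.

```python
from collections import Counter


def duplicate_cell_names(notebook):
    """Return the sorted list of names used by more than one cell."""
    names = [
        cell["metadata"]["name"]
        for cell in notebook.get("cells", [])
        if "name" in cell.get("metadata", {})
    ]
    return sorted(name for name, count in Counter(names).items() if count > 1)


notebook = {
    "cells": [
        {"metadata": {"name": "setup"}},
        {"metadata": {"name": "plot"}},
        {"metadata": {"name": "setup"}},
        {"metadata": {}},
    ]
}
print(duplicate_cell_names(notebook))  # ['setup']
```

Cells without a name are ignored, on the assumption that naming a cell is optional.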
Examples of tags as user-defined values
* TagRemovePreprocessor.remove_cell_tags: removes cells with specified tags (Tag remove preprocessor nbconvert#640)
* TagRemovePreprocessor.remove_input_tags: removes inputs with specified tags (enables input removal via preprocessor, template and invalid nb intermediate representation nbconvert#643)
* TagRemovePreprocessor.remove_all_outputs_tags: removes outputs with specified tags (Tag remove preprocessor nbconvert#640)
Examples of tags as hard-coded magic values
notebook:
* "*raises-exception*": allows exceptions to pass (allow raises-exception to continue #2549)
computationalmodelling/nbval:
* "pytest-validate-ignore-output": avoids testing cell (deprecated)
* "nbval-ignore-output": avoids testing cell
* "nbval-check-output": ensures testing cell even with --nbval-lax
* "nbval-raises-exception": allows exceptions to pass (analogous to notebook value)
* "nbval-skip": skips cell during testing
nteract/papermill:
* "parameters": identifies a single unique cell (https://github.com/nteract/papermill/blob/master/papermill/execute.py#L157)
Are there other projects that the core in any way maintains that use tags in either of these ways?
Regarding jupyterlab: @blink1073 @ian-r-rose @afshin are tags being used for anything meaningful in jupyterlab as of yet? Is there an official policy of how to handle them there?