Add support for loading metadata files #223

cmyr · 2018-11-02T22:53:18Z

This adds initial support for loading metadata from .tmPreferences
files. The main focus of this patch is loading the patterns used to
determine indentation. I expect to follow up with support for comment
strings, with support for other features possible in the future.

This is all gated behind a new metadata feature. If enabled, a
top-level Metadata object will be included in SyntaxSet; this object
can be queried for the metadata available for a given scope.

This patch does not attempt to implement any actual indentation logic;
that is left up to the consumer.

So I've added a DefaultPackage package to testdata; it's just two files, but maybe we should just put it in the repo with the other packages?

I've generally tried to keep the scope of this as narrow as possible, no bells & whistles; happy to iterate from here.

I've implemented simple autoindent (just using increaseIndentPattern and decreaseIndentPattern) in xi using this, and I'm happy with the results.
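For readers unfamiliar with how a consumer might use these two patterns, here is a minimal sketch. It is not the xi implementation: the `IndentRules` type and `indent_levels` function are hypothetical, and the patterns are modeled as plain predicate functions so the example stays std-only (in real `.tmPreferences` files they are regular expressions).

```rust
/// Hypothetical stand-in for the two indentation patterns.
/// Real metadata stores regexes; function pointers keep this sketch std-only.
pub struct IndentRules {
    pub increase: fn(&str) -> bool,
    pub decrease: fn(&str) -> bool,
}

/// Compute an indent level for every line.
pub fn indent_levels(rules: &IndentRules, lines: &[&str]) -> Vec<usize> {
    let mut level = 0usize;
    let mut out = Vec::with_capacity(lines.len());
    for line in lines {
        let trimmed = line.trim();
        // A line matching the decrease pattern is itself dedented.
        if (rules.decrease)(trimmed) {
            level = level.saturating_sub(1);
        }
        out.push(level);
        // A line matching the increase pattern indents the lines after it.
        if (rules.increase)(trimmed) {
            level += 1;
        }
    }
    out
}
```

A consumer would build `IndentRules` from whatever the metadata query returns for the current scope and call `indent_levels` (or an incremental equivalent) as lines are edited.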

Some of the logic around loading is a bit gnarly right now; I was doing a bunch of logging to figure out what all metadata fields existed, etcetera. So let's call this a rough cut?

trishume

Looks pretty good. Some comments. I may still want to spend some more time wrapping my head around the approach of converting the files to the final metadata, in case I can see a cleaner way.

trishume · 2018-11-04T21:39:20Z

src/parsing/metadata.rs

+            if !KEYS_WE_USE.contains(&key.as_str()) {
+                continue;
+            }
+            //NOTE: because we can't guarantee the order that files get loaded,


Doesn't this conflict with the doc comment about last writer winning? I think Sublime just does last writer wins and loads the packages in alphabetical-ish order or something. This seems like it would be pretty unintuitive behaviour.

cmyr: The comment is out of date, and I agree that just loading the files in alphabetical order is much simpler; will make that change.
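As an illustration of the "alphabetical order, last writer wins" idea discussed here, a minimal sketch follows. The `MetadataFile` type and `merge_last_writer_wins` function are hypothetical and reduce each file to string key/value pairs; they are not the types used in the patch.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of one parsed metadata file.
pub struct MetadataFile {
    pub path: String,
    pub scope: String,
    pub settings: Vec<(String, String)>,
}

/// Merge files so that, for each (scope, key), the file loaded last wins.
pub fn merge_last_writer_wins(
    mut files: Vec<MetadataFile>,
) -> BTreeMap<String, BTreeMap<String, String>> {
    // Sorting by path makes the load order deterministic, independent of
    // the order in which the directory walk happened to yield the files.
    files.sort_by(|a, b| a.path.cmp(&b.path));

    let mut merged = BTreeMap::new();
    for file in files {
        let entry: &mut BTreeMap<String, String> = merged.entry(file.scope).or_default();
        for (key, value) in file.settings {
            // A later file silently replaces an earlier value for the same key.
            entry.insert(key, value);
        }
    }
    merged
}
```

With this ordering, a setting defined in both `Packages/Default/...` and `Packages/Rust/...` resolves to the `Rust` value, because that path sorts later.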

trishume · 2018-11-04T21:46:17Z

src/parsing/syntax_set.rs

+
+    /// The metadata items that match the given scope. The result may be empty.
+    #[cfg(feature = "metadata")]
+    pub fn metadata_for_scope(&self, scope: &[Scope]) -> ScopedMetadata {


This helper doesn't seem like it saves enough code to be worth it to me.

trishume · 2018-11-04T21:55:55Z

src/parsing/metadata.rs

+}
+
+
+impl Pattern {


I was going to say that it would be nice if we could refactor SyntaxDefinition to use this as well, but I see that regex_str being public would make that a breaking change. I may still want to do that eventually, but it doesn't have to be in this PR.

trishume · 2018-11-04T22:04:07Z

src/parsing/metadata.rs

+                    .map(|score| (score, meta_set))
+            }).collect::<Vec<_>>();
+
+        metadata_matches.sort_by(|one, two| two.0.cmp(&one.0));


Hint: sort_by_key makes this cleaner
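To make the hint concrete, here is a small sketch of the same descending sort written with `sort_by_key` and `std::cmp::Reverse`. The `(u32, &str)` tuple is only a stand-in for the `(score, meta_set)` pairs in the diff, and `sort_best_first` is a hypothetical name.

```rust
use std::cmp::Reverse;

/// Sort (score, name) pairs so that the highest score comes first.
pub fn sort_best_first(matches: &mut [(u32, &str)]) {
    // Same ordering as `matches.sort_by(|one, two| two.0.cmp(&one.0))`:
    // `Reverse` flips the comparison of the key, and both sorts are stable.
    matches.sort_by_key(|m| Reverse(m.0));
}
```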

cmyr · 2018-11-06T15:51:08Z

@trishume thanks for the quick look. I agree that the initial loading/conversion feels a bit over-complicated, but couldn't think of an easier approach that correctly handles adding new files.

I'll do some cleanup on this a bit later on today.

As an addendum: I'd like to add support for the comment-related members of shellVariables, which I could either add on here or do as a follow-up PR. In any case, I see three basic approaches to this:

1. expose a shellVariables that includes all the keys in the metadata files, and just leave it entirely up to the consumer

2. extract the comment-related metadata items, and have, say,

struct MetadataItems {
    // ...
    comment_one: Option<(String, Option<String>)>,
    comment_two: Option<(String, Option<String>)>,
    comment_three: Option<(String, Option<String>)>,
}

3. or just go ahead and use the interface we expect the client to want, which is more like,

struct MetadataItems {
    // ...
    line_comment: Option<String>,
    block_comment: Option<(String, String)>,
}

The last option is the cleanest and the easiest to use, but it requires us to make some assumptions and to throw away some metadata. It feels like the best approach for now, but if you have an opinion I'm happy to hear it.
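To show what "make some assumptions and throw away some metadata" could mean in practice, here is a rough sketch of deriving the `line_comment` / `block_comment` shape from raw shell variables. It assumes the conventional `TM_COMMENT_START` / `TM_COMMENT_END` keys with `_2` and `_3` suffixes; the `CommentInfo` type and the function name are hypothetical and not part of this PR.

```rust
use std::collections::HashMap;

/// Hypothetical condensed form of the comment-related metadata.
#[derive(Debug, Default, PartialEq)]
pub struct CommentInfo {
    pub line_comment: Option<String>,
    pub block_comment: Option<(String, String)>,
}

/// Assumption: a start marker without an end marker is a line comment,
/// and a start marker with an end marker is a block comment.
pub fn comments_from_shell_variables(vars: &HashMap<String, String>) -> CommentInfo {
    let mut info = CommentInfo::default();
    for suffix in ["", "_2", "_3"] {
        let start = match vars.get(&format!("TM_COMMENT_START{}", suffix)) {
            Some(start) => start,
            None => continue,
        };
        match vars.get(&format!("TM_COMMENT_END{}", suffix)) {
            Some(end) if info.block_comment.is_none() => {
                info.block_comment = Some((start.clone(), end.clone()));
            }
            None if info.line_comment.is_none() => {
                info.line_comment = Some(start.clone());
            }
            // Any further line or block comment styles are discarded.
            _ => {}
        }
    }
    info
}
```

The discarded arm is exactly the information loss mentioned above: a language with two block comment styles would only expose the first one.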

keith-hall · 2018-11-06T19:35:39Z

I personally vote for:

expose a shellVariables that includes all the keys in the metadata files, and just leave it entirely up to the consumer

as it is the most flexible/useful :)

cmyr · 2018-11-06T19:40:23Z

Okay I've updated this to simplify the merging logic and address some of the other feedback.

I've noticed a probable bug here, as well: we go to pretty major effort to keep around the 'raw' metadata, so that we can correctly merge if new files are added; however we don't include the raw metadata when we do gendata, so adding new metadata to a syntax set that was loaded from a dump won't work. I'll resolve this in an upcoming commit.

cmyr · 2018-11-06T21:28:04Z

Okay, an issue with the raw metadata stuff:

bincode doesn't support deserialize_any, which means we can't deserialize serde_json::Value from it, so we can't really include the raw metadata in its current form in the pack files. (Including the raw metadata also more than doubles the size of the pack files.)

We could take this as an opportunity to simplify the overall approach; we could discard raw metadata when we build the SyntaxSet, and if the syntax set goes back to a builder we reset it. In this world adding more metadata after having generated a SyntaxSet would mean the new metadata wouldn't merge with the old; a scope that was present in the new metadata would overwrite anything that already existed.

This seems reasonable for any use-cases that I can imagine, and generally simplifies things; let me know if it makes sense to you?
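A small sketch of the trade-off described above, under the assumption that the built set keeps only the final per-scope items and no raw file data. `BuiltMetadata` and `Items` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Settings for one scope, reduced to string key/value pairs (hypothetical).
pub type Items = HashMap<String, String>;

/// Built metadata that keeps only the final per-scope items, not the raw files.
#[derive(Default)]
pub struct BuiltMetadata {
    scoped: HashMap<String, Items>,
}

impl BuiltMetadata {
    /// Add metadata for `scope`. Because no raw data is retained, there is
    /// nothing to merge with: the whole entry for that scope is replaced.
    pub fn add(&mut self, scope: &str, items: Items) {
        self.scoped.insert(scope.to_string(), items);
    }

    /// Look up a single setting for an exact scope.
    pub fn get(&self, scope: &str, key: &str) -> Option<&str> {
        self.scoped.get(scope)?.get(key).map(String::as_str)
    }
}
```

Adding a second entry for `source.rust` that defines only `increaseIndentPattern` therefore drops a previously loaded `decreaseIndentPattern` for that scope, which is the behaviour being proposed.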

cmyr · 2018-11-06T22:43:28Z

I've gone ahead and added a commit that makes the change described above. Let me know if you prefer a different approach.

trishume · 2018-11-11T01:48:06Z

examples/gendata.rs

@@ -2,26 +2,40 @@
 //! syntect, not as a helpful example for beginners.
 //! Although it is a valid example for serializing syntaxes, you probably won't need
 //! to do this yourself unless you want to cache your own compiled grammars.
+//!
+//! The standard command to generate the syntax dumps (including metadata) is:


Can you replace the command under `make packs` in the Makefile with this? Then maybe replace this comment with a note that the Makefile runs this script via `make packs`.

trishume · 2018-11-11T02:15:23Z

src/parsing/metadata.rs

+        where F: FnMut(&MetadataItems) -> Option<T>
+    {
+        self.items.iter()
+            .map(|(_, meta_set)| &meta_set.items)


Do you know that it's the case that Sublime does this trying multiple patterns thing? Seems rarely applicable and I wouldn't be surprised if Sublime only tries the matchiest, which would let you simplify a bunch of this.

cmyr: I believe Sublime does it this way; this is why we need the new default package, because it includes some base settings for the source and comment scopes. @keith-hall suggested this implementation in #183 or #179.
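The "try multiple matches" behaviour under discussion can be sketched as follows. This is a simplified model, not the code in the patch: selectors are treated as dot-separated scope prefixes, the score is just the number of atoms matched, and `MetadataItems`, `score`, and `first_match` are hypothetical names.

```rust
use std::cmp::Reverse;

/// Hypothetical, reduced set of metadata items for one selector.
pub struct MetadataItems {
    pub increase_indent: Option<String>,
    pub line_comment: Option<String>,
}

/// Simplified scoring: a selector matches if it is a prefix of the scope
/// on an atom boundary; more matched atoms means a better match.
fn score(selector: &str, scope: &str) -> Option<usize> {
    if scope == selector || scope.starts_with(&format!("{}.", selector)) {
        Some(selector.split('.').count())
    } else {
        None
    }
}

/// Try every matching entry, best match first, and return the first value
/// that the accessor `f` finds.
pub fn first_match<T, F>(entries: &[(String, MetadataItems)], scope: &str, mut f: F) -> Option<T>
where
    F: FnMut(&MetadataItems) -> Option<T>,
{
    let mut matches: Vec<(usize, &MetadataItems)> = entries
        .iter()
        .filter_map(|(selector, items)| score(selector, scope).map(|s| (s, items)))
        .collect();
    matches.sort_by_key(|m| Reverse(m.0));
    matches.into_iter().find_map(|(_, items)| f(items))
}
```

This is why base settings on a broad selector such as `source` still apply: a query for `source.rust` falls through to them whenever the more specific entry does not define the requested item.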

trishume

Just two little things, sorry I missed these on previous passes.

trishume · 2018-11-11T20:38:02Z

src/highlighting/mod.rs

@@ -3,7 +3,7 @@
 //! settings like selection color, `ThemeSet` for loading themes,
 //! as well as things starting with `Highlight` for how to highlight text.
 mod selector;
-mod settings;
+pub(crate) mod settings;


This was previously public and even though I think literally nobody uses it I'd rather not break semver compatibility for little reason.

cmyr: This was previously private; I exposed it because I needed StackSelectors.

cmyr: Er, not StackSelectors but Settings and SettingsObject, because I'm loading plists.

Edit: I can avoid this and just redeclare those typedefs locally; I already do this for SettingsObject.

trishume: oops derp cool

cmyr: Looked at this again; it's because I'm using load_plist. Could re-export manually if that makes more sense?

trishume · 2018-11-11T20:40:37Z

src/parsing/metadata.rs

+
+    /// Generates a `MetadataSet` from a single file
+    #[cfg(test)]
+    pub fn quick_load(path: &str) -> Result<MetadataSet, LoadingError> {


Suggested change

-    pub fn quick_load(path: &str) -> Result<MetadataSet, LoadingError> {
+    pub(crate) fn quick_load(path: &str) -> Result<MetadataSet, LoadingError> {


trishume · 2018-11-12T01:21:15Z

Oh whoops derp one last request sorry I keep forgetting things: Can you modify travis.yml to pass --features metadata to cargo test so that CI tests the new code?

cmyr · 2018-11-12T01:23:00Z

@trishume yep can do that now

cmyr · 2018-11-12T02:09:53Z

Okay, should be good to go?

trishume

Thanks for all this!

cmyr · 2018-11-12T02:16:47Z

Thanks for taking the time out of your Sunday to get it merged, much appreciated.

cmyr force-pushed the feature/metadata branch from b7b93fd to b3bb1fb Compare November 2, 2018 22:55

cmyr mentioned this pull request Nov 2, 2018

Simple syntect-based autoindent xi-editor/xi-editor#971

Merged

trishume reviewed Nov 4, 2018

cmyr force-pushed the feature/metadata branch from ff456ec to 08e8978 Compare November 6, 2018 19:20

cmyr force-pushed the feature/metadata branch from 08e8978 to 09ac1bd Compare November 6, 2018 19:37

cmyr force-pushed the feature/metadata branch from a75c4bb to f0d581c Compare November 8, 2018 19:46

cmyr mentioned this pull request Nov 9, 2018

Add support for metadata 'shellVariables' #225

Merged

trishume reviewed Nov 11, 2018

cmyr added 5 commits November 11, 2018 16:00

Simplify loading and merging metadata

77e5365

This also includes a bit of PR feedback and cleanup.

Simplify handling of raw metadata

b70c242

Metadata: improve docs and interfaces

0cca45e

Makefile generates metadata packdump

7381243

cmyr force-pushed the feature/metadata branch from e177132 to 7381243 Compare November 11, 2018 21:01

Update travis to test metadata feature

3fc536e

cmyr force-pushed the feature/metadata branch from 106c952 to 3fc536e Compare November 12, 2018 01:44

trishume approved these changes Nov 12, 2018

trishume merged commit c2e857f into trishume:master Nov 12, 2018

cmyr deleted the feature/metadata branch November 12, 2018 02:16

jmacdonald mentioned this pull request Dec 8, 2020

Add command to comment out a selection of text jmacdonald/amp#204

Merged
