diff --git a/AUTHORS.txt b/AUTHORS.txt
index 56eda98394..7415a097a3 100644
--- a/AUTHORS.txt
+++ b/AUTHORS.txt
@@ -305,4 +305,5 @@ Contributors:
 - Jonathan Sharpe
 - Michael Rush
 - Florian Bezdeka
+- Patrick Scheibe
 - Kyle Brown
diff --git a/CHANGES.md b/CHANGES.md
index 76ed1a7609..5f08790487 100644
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -23,12 +23,21 @@ Language Improvements:
 - enh(php) highlight variables (#2785) [Taufik Nurrohman][]
 - fix(python) Handle comments on decorators (#2804) [Jonathan Sharpe][]
 - enh(diff) improve highlighting of diff for git patches [Florian Bezdeka][]
+- enh(mathematica) Rework entire implementation [Patrick Scheibe][]
+  - Correct matching of the many variations of Mathematica's numbers
+  - Matching of named-characters aka special symbols like `\[Gamma]`
+  - Updated list of version 12.1 built-in symbols
+  - Matching of patterns, slots, message-names and braces
 
 Dev Improvements:
 
 - chore(dev) add theme picker to the tools/developer tool (#2770) [Josh Goebel][]
 - fix(dev) the Vue.js plugin no longer throws an exception when hljs is not in the global namespace [Kyle Brown][]
 
+Parser:
+
+- enh(grammars) allow `classNameAliases` for more complex grammars [Josh Goebel][]
+
 New themes:
 
 - *StackOverflow Dark* by [Jan Pilzer][]
@@ -41,8 +50,10 @@ New themes:
 [Jan Pilzer]: https://github.com/Hirse
 [Jonathan Sharpe]: https://github.com/textbook
 [Michael Rush]: https://github.com/rushimusmaximus
+[Patrick Scheibe]: https://github.com/halirutan
 [Kyle Brown]: https://github.com/kylebrown9
 
+
 ## Version 10.3.1
 
 Prior version let some look-behind regex sneak in, which does not work
diff --git a/docs/mode-reference.rst b/docs/mode-reference.rst
index c32e705d43..4c058a5e63 100644
--- a/docs/mode-reference.rst
+++ b/docs/mode-reference.rst
@@ -1,4 +1,4 @@
-Mode reference
+Mode Reference
 ==============
 
 Types
@@ -23,13 +23,24 @@ Types of attributes values in this reference:
 +------------+-------------------------------------------------------------------------------------+
 
 
-Attributes
-----------
+Language Only Attributes
+------------------------
+
+These attributes are only valid at the language level (ie, they may only exist on the top-most language object and have no meaning if specified in children modes).
+
+
+name
+^^^^
+
+- **type**: string
+
+The canonical name of this language, ie "JavaScript", etc.
+
 
 case_insensitive
 ^^^^^^^^^^^^^^^^
 
-**type**: boolean
+- **type**: boolean
 
 Case insensitivity of language keywords and regexps. Used only on the top-level mode.
 
@@ -37,15 +48,56 @@ Case insensitivity of language keywords and regexps. Used only on the top-level
 aliases
 ^^^^^^^
 
-**type**: array
+- **type**: array
 
 A list of additional names (besides the canonical one given by the filename) that can be used to identify a language in HTML classes and in a call to :ref:`getLanguage `.
 
 
+classNameAliases
+^^^^^^^^^^^^^^^^
+
+- **type**: object
+
+A mapping table of any custom class names your grammar uses and their supported equivalencies. Perhaps your language has a concept of "slots" that roughly correspond to variables in other languages. This allows you to write grammar code like:
+
+::
+
+  {
+    classNameAliases: {
+      slot: "variable",
+      "message-name": "string"
+    },
+    contains: [
+      {
+        className: "slot",
+        begin: // ...
+      }
+    ]
+  }
+
+The final HTML output will render slots with the CSS class as ``hljs-variable``. This feature exists to make it easier for grammar maintainers to think in their own language when maintaining a grammar.
+
+For a list of all supported class names please see the :doc:`CSS class reference
+`.
+
+
+disableAutodetect
+^^^^^^^^^^^^^^^^^
+
+- **type**: boolean
+
+Disables autodetection for this language.
+
+
+
+Mode Attributes
+---------------
+
+
 className
 ^^^^^^^^^
 
-**type**: identifier
+- **type**: identifier
 
 The name of the mode. It is used as a class name in HTML markup.
 
@@ -56,16 +108,16 @@ for one thing like string in single or double quotes.
 begin
 ^^^^^
 
-**type**: regexp
+- **type**: regexp
 
 Regular expression starting a mode. For example a single quote for strings or two forward slashes for C-style comments.
 If absent, ``begin`` defaults to a regexp that matches anything, so the mode starts immediately.
 
 
 on:begin
-^^^^^^^^^^^
+^^^^^^^^
 
-**type**: callback (matchData, response)
+- **type**: callback (matchData, response)
 
 This callback is triggered the moment a begin match is detected. ``matchData`` includes the typical regex match data; the full match, match groups, etc. The ``response`` object is used to tell the parser how it should handle the match. It can be also used to temporarily store data.
 
@@ -78,7 +130,7 @@ For an example of usage see ``END_SAME_AS_BEGIN`` in ``modes.js``.
 end
 ^^^
 
-**type**: regexp
+- **type**: regexp
 
 Regular expression ending a mode. For example a single quote for strings or "$" (end of line) for one-line comments.
 
@@ -93,9 +145,9 @@ This is achieved with :ref:`endsWithParent ` attribute.
 
 
 on:end
-^^^^^^^^^^^
+^^^^^^
 
-**type**: callback (matchData, response)
+- **type**: callback (matchData, response)
 
 This callback is triggered the moment an end match is detected. ``matchData`` includes the typical regex match data; the full match, match groups, etc. The ``response`` object is used to tell the parser how it should handle the match. It can also be used to retrieve data stored from a `begin` callback.
 
@@ -106,9 +158,9 @@ For an example of usage see ``END_SAME_AS_BEGIN`` in ``modes.js``.
 
 
 beginKeywords
-^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^
 
-**type**: string
+- **type**: string
 
 Used instead of ``begin`` for modes starting with keywords to avoid needless repetition:
 
@@ -140,7 +192,7 @@ Ex. ``class A { ... }`` would match while ``A.class == B.class`` would not.
 endsWithParent
 ^^^^^^^^^^^^^^
 
-**type**: boolean
+- **type**: boolean
 
 A flag showing that a mode ends when its parent ends.
 
@@ -169,7 +221,7 @@ This is when ``endsWithParent`` comes into play:
 endsParent
 ^^^^^^^^^^^^^^
 
-**type**: boolean
+- **type**: boolean
 
 Forces closing of the parent mode right after the current mode is closed.
 
@@ -215,7 +267,7 @@ endSameAsBegin (deprecated as of 10.1)
 ``END_SAME_AS_BEGIN`` mode or use the ``on:begin`` and ``on:end`` attributes to
 build more complex paired matchers.
 
-**type**: boolean
+- **type**: boolean
 
 Acts as ``end`` matching exactly the same string that was found by the
 corresponding ``begin`` regexp.
@@ -244,7 +296,7 @@ and ``endSameAsBegin: true``.
 lexemes (now keywords.$pattern)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-**type**: regexp
+- **type**: regexp
 
 A regular expression that extracts individual "words" from the code to compare
 against :ref:`keywords `. The default value is ``\w+`` which works for
@@ -260,7 +312,7 @@ constant that you repeat multiple times within different modes of your grammar.
 keywords
 ^^^^^^^^
 
-**type**: object
+- **type**: object / string
 
 Keyword definition comes in two forms:
 
@@ -273,7 +325,7 @@ For detailed explanation see :doc:`Language definition guide `.
 illegal
 ^^^^^^^
 
-**type**: regexp
+- **type**: regexp
 
 A regular expression that defines symbols illegal for the mode.
 When the parser finds a match for illegal expression it immediately drops parsing the whole language altogether.
@@ -282,7 +334,7 @@ When the parser finds a match for illegal expression it immediately drops parsin
 excludeBegin, excludeEnd
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
-**type**: boolean
+- **type**: boolean
 
 Exclude beginning or ending lexemes out of mode's generated markup. For example in CSS syntax a rule ends with a semicolon.
 However visually it's better not to color it as the rule contents. Having ``excludeEnd: true`` forces a ```` element for the rule to close before the semicolon.
@@ -291,7 +343,7 @@ However visually it's better not to color it as the rule contents. Having ``excl
 returnBegin
 ^^^^^^^^^^^
 
-**type**: boolean
+- **type**: boolean
 
 Returns just found beginning lexeme back into parser. This is used when beginning of a sub-mode is a complex expression
 that should not only be found within a parent mode but also parsed according to the rules of a sub-mode.
@@ -302,7 +354,7 @@ Since the parser is effectively goes back it's quite possible to create a infini
 returnEnd
 ^^^^^^^^^
 
-**type**: boolean
+- **type**: boolean
 
 Returns just found ending lexeme back into parser. This is used for example to parse JavaScript embedded into HTML.
 A JavaScript block ends with the HTML closing tag ```` that cannot be parsed with JavaScript rules.
@@ -314,7 +366,7 @@ Since the parser is effectively goes back it's quite possible to create a infini
 contains
 ^^^^^^^^
 
-**type**: array
+- **type**: array
 
 The list of sub-modes that can be found inside the mode. For detailed explanation see :doc:`Language definition guide `.
 
@@ -322,7 +374,7 @@ The list of sub-modes that can be found inside the mode. For detailed explanatio
 starts
 ^^^^^^
 
-**type**: identifier
+- **type**: identifier
 
 The name of the mode that will start right after the current mode ends. The new mode won't be contained within the current one.
 
@@ -333,7 +385,7 @@ Tags ``