Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable PHP syntax highlighting without <?php, also fix markdown + PHP code fence issues #16006

Closed
wants to merge 3 commits into from

Conversation

mousetraps
Copy link
Contributor

  • Provide a new language (phplanguage) for reference by services requiring PHP syntax highlighting for code without script/start end tags.
  • Add workaround to prevent phplanguage from appearing in Language Mode list
  • fix markdown + PHP fenced code blocks
    • previously PHP wasn't working at all because we don't appear to properly support injections. This issue was addressed by pointing to a specific repository key (#languages).
    • ensure "fuzzy" syntax highlighting for PHP/HTML code so that script start tags are not required

close #14166, close #3746

for reference by services requiring PHP syntax highlighting for code missing script start / end tags
- previously PHP wasn't working at all because we don't appear to properly support grammar injections. This issue was addressed by pointing to a specific repository key (#languages)
- ensure "fuzzy" syntax highlighting for PHP/HTML code so that script start tags are not required
@mention-bot
Copy link

@mousetraps, thanks for your PR! By analyzing the history of the files in this pull request, we identified @mjbvz, @egamma and @alexandrudima to be potential reviewers.

@msftclas
Copy link

Hi @mousetraps, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution!


It looks like you're a Microsoft contributor (Sara Itani). If you're full-time, we DON'T require a Contribution License Agreement. If you are a vendor, please DO sign the electronic Contribution License Agreement. It will take 2 minutes and there's no faxing! https://cla.microsoft.com.

TTYL, MSBOT;

@felixfbecker
Copy link
Contributor

@mousetraps Will this require the language server to send a MarkedString with the ID phplanguage? Other client will likely only support php

// For reference by services requiring PHP syntax highlighting for partial
// code missing script start/end tags.
"id": "phplanguage",
"aliases": [ null ], // unsupported: prevents language from appearing in Language Mode list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this just be "aliases": null instead of the one-element array?

Copy link
Contributor Author

@mousetraps mousetraps Nov 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it could be! Initially, I opted to simply opted to light up the hack that was already in use because there were enough things that depended on the existence of an array, but on closer look, they already all have isArray checks, so agreed that "null" would be preferable.

Or.... I suppose one could argue that an array is nice, because you could also add additional invisible aliases? Not actually arguing it... just saying that one could do such a thing. 😜

@mousetraps
Copy link
Contributor Author

mousetraps commented Nov 25, 2016

@felixfbecker Regarding the phplanguage id, yes, and I don't really like it either. It's certainly non-ideal, but other clients could add an alias for it, as we do for many other languages. To recap a bit of the thought process here... In this case, what we want is:

  • when opening a PHP file, match script tags
  • when hovering/debugging, don't match script tags

Unfortunately TM grammars don't allow a sane way to accomplish both of these things at once, hence the impetus for the new language. Other options added far too much complexity (e.g. adding special-casing for PHP throughout the rendering code, or adding some special "hover" extension point mapping, etc.)

That said, I've been thinking about it a bit today, and started messing around with another idea, so would be curious to hear any preliminary thoughts on this.

The way we render markdown code fences is similar enough to the way we render hover / eval, and it would be great if we could just use that for every language we support. The challenge with simply rendering hover as we render code fences is that at the moment, each code fence requires a manual addition to markdown.tmLanguage, so this wouldn't work for grammars we haven't explicitly included. But, what if we generated markdown.tmLanguage code fences on the fly based on loaded grammars? That way we...

  • wouldn't be caught up with duplicating a ton of patterns for each fenced code block
  • would be able to have code fence support for any arbitrary language
  • address the PHP language issue (and potential issues w/ future languages) w/o forcing people to specify "special" languages

The other benefit of this approach is that we would actually adhere to the protocol we claim to support (always a plus 😉)

The pair of a language and a value is an equivalent to markdown:
```${language}
${value}

This latter realization was actually a bit surprising to me - I hadn't realized that was an explicit part of the spec - and it feels like something we might already have an issue open for because it looks like we actually anticipated it in the design of the protocol. Unfortunately I have failed to locate this theoretical issue. Anyone with a better sense of the history here have any ideas on this?

@mousetraps
Copy link
Contributor Author

mousetraps commented Nov 25, 2016

Just to expand a bit on my previous comment... the new proposal would be:

  • short term: render all hover responses exactly as we render the corresponding markdown code fences. This would break syntax highlighting for hovering/eval for languages not supported included in the markdown file, but enable us to better support built-in grammars like PHP
  • long term: special case markdown to automatically generate code fence patterns for other languages.

@felixfbecker
Copy link
Contributor

render all hover responses exactly as we render the corresponding markdown code fences

iirc, that is exactly how VS Code treats MarkedString responses from language servers - it just wraps them in code fences with a language tag.
Which is why I am a bit confused, in your PR you say

In fenced PHP code blocks, we match a "fuzzy" version of the PHP language

if that is true, then everything should be fine with just a php tagged MarkedString (no extra language).

Now, I don't know how other editors do this (if they auto-generate tmLanguages) but GitHub, Atom, all other editors I have met can "auto-detect" whether they should render a PHP tagged code block as "phplanguage" or "php". This is also very important because the debugger for example cannot guarantee that a <?php tag is present or not. eval()ed code is not wrapped (it can still contain them to exit out of PHP mode), but other sources that are retrieved through sourceRequest might be.

So the best solution would somehow auto-detect and not require to send non-established language IDs.

@kaloyan-raev @mniewrzal how does Eclipse Che work here?

@kaloyan-raev
Copy link
Contributor

how does Eclipse Che work here?

Eclipse Che has no syntax highlighting in hover messages yet. Code blocks are just displayed in a monospace font.

Copy link
Collaborator

@mjbvz mjbvz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Markdown grammar changes look good to me.

@mousetraps
Copy link
Contributor Author

Sorry for the delay, was on vacation, and then at a conference.

Closing this thread in favor of #16502 (markdown tm language fixes) and #16503 (properly rendering MarkedString)

@felixfbecker, to reply to your comments...

iirc, that is exactly how VS Code treats MarkedString responses from language servers - it just wraps them in code fences with a language tag.

Indeed, that is what the spec claims, but it does not represent the actual behavior. Otherwise, simply fixing the markdown grammar would have addressed the hover issue as well.

So the best solution would somehow auto-detect and not require to send non-established language IDs.

Agreed, though in this case "auto-detecting" in this case means "auto-detecting" context - if we are only given two strings, one with a <?php tag, one without, there is no way to reliably auto-detect the preferred mode ("actual" or "fuzzy"). Therefore, we must rely on other contextual clues (e.g. are we performing a hover, or are we editing a file, or are we in a md codeblock, etc.)

@mousetraps mousetraps closed this Dec 5, 2016
@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PHP syntax highlighting without <?php Markdown Fenced PHP Code Block is Not Syntax Highlighted
8 participants