-
Notifications
You must be signed in to change notification settings - Fork 824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better support non-prefix (also called flex) completions #651
Comments
Perhaps a 80% solution would be to compute the longest common subsequence between the filter text and the word at point, on the client? From my reading of the protocol, it seems that at least sometimes the client is responsible for filtering completions, so this logic is necessary on the client anyway. In particular a server can respond with |
Fine as long as the client and server can know via LSP what kind of filtering is to be done. But it's harder to extend the protocol in that direction. It's easier for the server to decide what kinds of completion it can perform on the thing at point and hint about how it arrived at those via LSP. Then the client can relay that hint, visually to the user. |
Yes, sometimes the client is responsible for filtering. All the more reason why its currently kind of a bad situation that the client doesn't really know what part of the document the completions are supposed to match (see also long discussions about this in the linked issue #648). Your 80% suggestion, similar to other ones made in that linked issue all boil down to trying to 'guess' what region in the document the completions are supposed to be matched against. This is all well and good if you can actually guess correctly, but it is an inherently flawed and flaky approach. So I'm with @joaotavora on this, and think the server should be given the option to specify somehow what the 'match region' or 'input region' or whatever you want to call it is, rather than expect a client to second guess this information. I probably wouldn't go as far as to make it as detailed as @joaotavora proposes. I.e I don't think we need to go as far as making this a list of regions for a 'flex' match, but rather simply indicate the start and end of the document region that is being matched against. I think that would already be sufficient information. |
@joaotavora if you agree that we don't need the 'full' detailed list of match regions to support flex matches, but that it would be enough to just indicate the start and end of the 'match region' or 'input text' in the document, then perhaps we can close this ticket in favour of #648. |
@kdvolder, well it's not really the same is it? So it would amount to giving up on this request. |
Why not? If the server is lazy to compute the exact subregions and just wants to return a reasonable region, return a single element. Or don't return anything, that's reasonable, too: this is entirely optional. But it should be possible to hint as much as possible to the client what was considered in the match. |
BTW #648 is a different approach, which I believe is harder to fit in the way LSP already works. This one is way easier. The disadvantage here: you're wasting a lot of bandwidth (but is that a problem?) |
I agree it isn't the same. I was only suggesting that maybe it was a 'sufficient' replacement. So perhaps you might indeed be okay dropping this request.
I don't see why. Both are just adding a 'optional' piece of information a server may or may not choose to provide. I think they are both similarly hard in that the require to add new attribute(s) to the completion items.
I don't see how it is easier. To me it looks more complex. Seeing as it adds a more complex piece of information than the other proposals. I'm not fundamentally opposed to your idea however. I just thought there would be a better chance of actually solving the problem, that both these issues seem to be touching on, if discussion could be focussed on the other issue (were most people seem to be gravitating towards). But if you don't feel like 'giving up' on this ticket, it is entirely up to you. |
Right, that part is true. But if the client was to pass the "text to be completed" to the server, the server could not ignore it could it? The semantics of passing that would be "give me all that matches this text, and don't give me anything that doesn't match this text". So while it would be optional for the clients, it would not be optional for the server. Err, never mind, the server could just say it doesn't support this kind of thing in its capabilities. So forget the argument in the previous paragraph. Well anyway, you're right, but it's still true that it is inherently less powerful. If we did your proposal, either we would have to additionally specify something like |
Anyway in #648, someone cleverly mentions that using a |
Actually this is exactly what vscode does. Not sure if it is 'clever'. It is really also just a 'heuristic'. The fact that the edit range usually coincides with the 'input range' is really nothing but a happy coincidence. I have a real-world example (microsoft/vscode#26096) where that particular 'heuristic' failed us and vscode fixed it by making the heuristic a bit more complex and add special case logic when the edit range starts with some leading whitespace. I also posted this in the other ticket by the way. |
The heuristic (the guessing) is client-side, right?
Can't the server make sure that it always coincides?
Yeah I read it. I didn't understand though, but I believe you :-) |
Right. Typically it works like this:
I think the problem you stumbled upon, is, when you are the client, how do you determine which pieces do you 'bolden' in the items? Typically you want to highlight the bits that match the 'input text'. (This also plays a part in incremental filtering and proposal sorting). The client can't do the proper highighting if it doesn't know what items are supposed to match against.
Usually, yes. But not always. Because the range has a meaning other than 'this is the input text'. It means 'this text must be replaced by the edit'. These usually coincide, or can be made to. But there's really no fundamental reason why they should be the same.
Its long and complex :-). I'll try explain it. Basically... our proposals wanted to 'de-dent' the result and so it must delete some spaces in front of the 'input text'. So that really dictates exactly what our edit range should be (or the result would be incorrect). And unfortunately the edit range is then not equal to the input text used for finding the completions. For example. You have this in the document (
We want the result to be:
To make the 'foobar' completion line up correctly with
Ie. the edit range includes the spaces in front of 'foo' not just the word 'foo' itself so the heuristic incorrectly assumes completions must be matched against the text |
@kdvolder I believe that what you are doing thee is fundamentally not what a completion should do. Completion should complete the word. A formatted or indentation system should handle that. If you really needed to do it for some reason, use an additionalTextEdit. It just seems completely wrong to me that selecting a completion string changes the text around what I completed. |
@puremourning, I would agree, but then the spec has Anyway I'd just like to summarize the state of discussion here: a. providing input text to the server, as #648 suggests, would solve 80% of this problem: flex completions. It is possible that with the current b. providing the input text and an identifier for a completion method that both the server and the client know ( c. not providing input text but having the server return a formatted hint as to why each completion matches (like an array of characters indexes) would also solve this problem 100%. My preferred method is c because there's less chance that the client and server disagree on what exactly is I don't have any idea if b and c solve any of @kdvolder's problems. |
@puremourning @joaotavora While that does make sense, there is unfortunately an actual issue with using the |
I have to agree with the clangd guys in this one. An additionalTextEdit that fixes the . To a -> seems a legit use case for that. |
I really don't want to get into an argument what a completion should or shouldn't do. At the end of the day you are trying to make things convenient for a user and users find it annoying if the inserted text is at the wrong indentation level. This is all the more so because we are doing a editor for various yaml DSL and its not just a matter of indentation being 'ugly' but actually differences in indentation changes the meaning of the inserted text (since in yaml indentation has semantic meaning). We are doing these kinds of things because users have told us that they want it. I don't go and argue with them to try to convince them they are wrong.
That doesn't work because of the restrictions on additional edits (shouldn't touch/overlap) the main edit. So really the choice we have is between
There is no alternative. |
I think the scoring, filtering and highlighting should be all under the client control. In LSP we decided so far to not add any of those to ensure that the behavior around these match the typical client behavior. I do agree that this needs either more clarification or additional attributes to support that feature on the client side. In this regard it is similar to #648 |
I can understand what you mean by "highlighting" under client control. I can more or less understand what you mean by "scoring" under client control. But the most important point to tackIe first: I cannot at all understand "filtering" under client control. Hopefully you could disambiguate using this very simple example: If the client's editor is on "foo" and it requests a completion, how can the server know whether to supply just the symbol "foobar" or also the symbol "frodo"? Do you think: a) that the client should annouce (perhaps globally or alongside each completion request), the completion strategy he is interested in, say b) that the server should by default send only the completions that it thinks are relevant, and decide the completion strategy on its own. How in this case would the client understand what strategy the server used so it can proceed with the highlighting. c) the server should ignore completely ignore "foo" and send the full list "bar", "baz", "foo", "frodo", i.e. all the symbols that it has available (I think this would be a little inefficient). d) something else I'm not grasping. I think a) and b) are valid, but a) is harder to spec because we have to very precisly describe what |
Good question. The language servers I implemented so far follow strategy (c) however I see that this can cause trouble in other clients. I am still the opinion that we should avoid specing filtering since it will block innovation. I would rather say we have We only say problems with languages that have thousands of global symbols with this approach. And here these global symbols never change so we were thinking about an LSP extension to get those symbols once. |
So if I type, say: #include <stdio.h>
#include "big_s_lib.h"
int main (int argc, char* argv[]){
int foobar=42;
int frodo=24;
printf("Hello World %d",foo<POINT/CURSOR HERE> And request a completion there, then you are saying that the server should send completions for all known symbols, including macros, functions, extern variables, etc. not just the ones that it thinks are fit to complete "foo"? If so, I think this is a bad strategy to solve this problem:
Isn't it better to let the server, again optionally, provide a formatted
Maybe I don't understand what you mean: they do change. If I add a function to C it goes on the global namespace and immediately becomes a completion candidate in many other places. |
This is not always practical. For example in the spring.properties editor we built all matches would literally be thousands of property names. So it is rather important to filter them beforehand based on what a user typed. Also filtering is somewhat language dependent. For example for the .properties files, the 'identifiers' contain dots whereas for most languages I gather they would not. So, the 'input region' in the document is selected by the language server based on rules that are rather language specific (e.g. do you include the dots a user typed in the filter or not? |
I must explicitly add my support to what @kdvolder said, @dbaeumer . I'd venture to say it's a very bad idea to spec this. |
Prefix completion is all we get in LSP because there are some servers that send *all* completions everytime. This is horrible, but it's the currently defined behaviour. See microsoft/language-server-protocol#651. * eglot.el (eglot-completion-at-point): Use all-completions.
Prefix completion is all we get in LSP because there are some servers that send *all* completions everytime. This is horrible, but it's the currently defined behaviour. See microsoft/language-server-protocol#651. * eglot.el (eglot-completion-at-point): Use all-completions.
Reworked important parts of eglot-completion-at-point. One of the tasks was to cleanup the nomenclature so it's easier to spot how LSP and Emacs's views of completion techniques differ. When reading this rather long function, remember an "item" is a plist representing the LSP completionItem object, and "proxy" is a propertized string that Emacs's frontends will use to represent that completion. When the completion is close to done, the :exit-function is called, to potentially rework the inserted text so that the final result might be quite different from the proxy (it might be a snippet, or even a suprising text edit). The most important change in this commit reworks the way the completion "bounds" are calculated in the buffer. This is the region that Emacs needs to know that is being targeted for the completion. A server can specify this region by using textEdit-based completions all consistently pointing to the same range. If it does so, Emacs will use that region instead of its own understanding of symbol boundaries (provided by thingatpt.el and syntax tables). To implement server-side completion filtering, the server can also provide a filterText "cookie" in each completion, which, when prefix-matched to the intended region, selects or rejects the completion. Given the feedback in microsoft/language-server-protocol#651, we have no choice but to play along with that inneficient and grotesque strategy to implement flex-style matching. Like ever in LSP, we do so while being backward-compatible to all previously supported behaviour. * eglot.el (eglot-completion-at-point): rework.
Reworked important parts of eglot-completion-at-point. One of the tasks was to cleanup the nomenclature so it's easier to spot how LSP and Emacs's views of completion techniques differ. When reading this rather long function, remember an "item" is a plist representing the LSP completionItem object, and "proxy" is a propertized string that Emacs's frontends will use to represent that completion. When the completion is close to done, the :exit-function is called, to potentially rework the inserted text so that the final result might be quite different from the proxy (it might be a snippet, or even a suprising text edit). The most important change in this commit reworks the way the completion "bounds" are calculated in the buffer. This is the region that Emacs needs to know that is being targeted for the completion. A server can specify this region by using textEdit-based completions all consistently pointing to the same range. If it does so, Emacs will use that region instead of its own understanding of symbol boundaries (provided by thingatpt.el and syntax tables). To implement server-side completion filtering, the server can also provide a filterText "cookie" in each completion, which, when prefix-matched to the intended region, selects or rejects the completion. Given the feedback in microsoft/language-server-protocol#651, we have no choice but to play along with that inneficient and grotesque strategy to implement flex-style matching. Like ever in LSP, we do so while being backward-compatible to all previously supported behaviour. * eglot.el (eglot-completion-at-point): rework.
…efix Prefix completion is all we get in LSP because there are some servers that send *all* completions everytime. This is horrible, but it's the currently defined behaviour. See microsoft/language-server-protocol#651. * eglot.el (eglot-completion-at-point): Use all-completions.
Reworked important parts of eglot-completion-at-point. One of the tasks was to cleanup the nomenclature so it's easier to spot how LSP and Emacs's views of completion techniques differ. When reading this rather long function, remember an "item" is a plist representing the LSP completionItem object, and "proxy" is a propertized string that Emacs's frontends will use to represent that completion. When the completion is close to done, the :exit-function is called, to potentially rework the inserted text so that the final result might be quite different from the proxy (it might be a snippet, or even a suprising text edit). The most important change in this commit reworks the way the completion "bounds" are calculated in the buffer. This is the region that Emacs needs to know that is being targeted for the completion. A server can specify this region by using textEdit-based completions all consistently pointing to the same range. If it does so, Emacs will use that region instead of its own understanding of symbol boundaries (provided by thingatpt.el and syntax tables). To implement server-side completion filtering, the server can also provide a filterText "cookie" in each completion, which, when prefix-matched to the intended region, selects or rejects the completion. Given the feedback in microsoft/language-server-protocol#651, we have no choice but to play along with that inneficient and grotesque strategy to implement flex-style matching. Like ever in LSP, we do so while being backward-compatible to all previously supported behaviour. * eglot.el (eglot-completion-at-point): rework.
…efix Prefix completion is all we get in LSP because there are some servers that send *all* completions everytime. This is horrible, but it's the currently defined behaviour. See microsoft/language-server-protocol#651. * eglot.el (eglot-completion-at-point): Use all-completions.
Reworked important parts of eglot-completion-at-point. One of the tasks was to cleanup the nomenclature so it's easier to spot how LSP and Emacs's views of completion techniques differ. When reading this rather long function, remember an "item" is a plist representing the LSP completionItem object, and "proxy" is a propertized string that Emacs's frontends will use to represent that completion. When the completion is close to done, the :exit-function is called, to potentially rework the inserted text so that the final result might be quite different from the proxy (it might be a snippet, or even a suprising text edit). The most important change in this commit reworks the way the completion "bounds" are calculated in the buffer. This is the region that Emacs needs to know that is being targeted for the completion. A server can specify this region by using textEdit-based completions all consistently pointing to the same range. If it does so, Emacs will use that region instead of its own understanding of symbol boundaries (provided by thingatpt.el and syntax tables). To implement server-side completion filtering, the server can also provide a filterText "cookie" in each completion, which, when prefix-matched to the intended region, selects or rejects the completion. Given the feedback in microsoft/language-server-protocol#651, we have no choice but to play along with that inneficient and grotesque strategy to implement flex-style matching. Like ever in LSP, we do so while being backward-compatible to all previously supported behaviour. * eglot.el (eglot-completion-at-point): rework.
Prefix completion is all we get in LSP because there are some servers that send *all* completions everytime. This is horrible, but it's the currently defined behaviour. See microsoft/language-server-protocol#651. * eglot.el (eglot-completion-at-point): Use all-completions. #319: joaotavora/eglot#319
Reworked important parts of eglot-completion-at-point. One of the tasks was to cleanup the nomenclature so it's easier to spot how LSP and Emacs's views of completion techniques differ. When reading this rather long function, remember an "item" is a plist representing the LSP completionItem object, and "proxy" is a propertized string that Emacs's frontends will use to represent that completion. When the completion is close to done, the :exit-function is called, to potentially rework the inserted text so that the final result might be quite different from the proxy (it might be a snippet, or even a suprising text edit). The most important change in this commit reworks the way the completion "bounds" are calculated in the buffer. This is the region that Emacs needs to know that is being targeted for the completion. A server can specify this region by using textEdit-based completions all consistently pointing to the same range. If it does so, Emacs will use that region instead of its own understanding of symbol boundaries (provided by thingatpt.el and syntax tables). To implement server-side completion filtering, the server can also provide a filterText "cookie" in each completion, which, when prefix-matched to the intended region, selects or rejects the completion. Given the feedback in microsoft/language-server-protocol#651, we have no choice but to play along with that inneficient and grotesque strategy to implement flex-style matching. Like ever in LSP, we do so while being backward-compatible to all previously supported behaviour. * eglot.el (eglot-completion-at-point): rework. #235: joaotavora/eglot#235
Prefix completion is all we get in LSP because there are some servers that send *all* completions everytime. This is horrible, but it's the currently defined behaviour. See microsoft/language-server-protocol#651. * eglot.el (eglot-completion-at-point): Use all-completions. GitHub-reference: per joaotavora/eglot#319
Reworked important parts of eglot-completion-at-point. One of the tasks was to cleanup the nomenclature so it's easier to spot how LSP and Emacs's views of completion techniques differ. When reading this rather long function, remember an "item" is a plist representing the LSP completionItem object, and "proxy" is a propertized string that Emacs's frontends will use to represent that completion. When the completion is close to done, the :exit-function is called, to potentially rework the inserted text so that the final result might be quite different from the proxy (it might be a snippet, or even a suprising text edit). The most important change in this commit reworks the way the completion "bounds" are calculated in the buffer. This is the region that Emacs needs to know that is being targeted for the completion. A server can specify this region by using textEdit-based completions all consistently pointing to the same range. If it does so, Emacs will use that region instead of its own understanding of symbol boundaries (provided by thingatpt.el and syntax tables). To implement server-side completion filtering, the server can also provide a filterText "cookie" in each completion, which, when prefix-matched to the intended region, selects or rejects the completion. Given the feedback in microsoft/language-server-protocol#651, we have no choice but to play along with that inneficient and grotesque strategy to implement flex-style matching. Like ever in LSP, we do so while being backward-compatible to all previously supported behaviour. * eglot.el (eglot-completion-at-point): rework. GitHub-reference: close joaotavora/eglot#235
This is the start of a discussion before submitting a pull request for a protocol extension.
Some editors support what is called substring, non-prefix or sometimes "flex" completion. It means that the substring being completed is not necessarily the prefix of each completion in the result but is somehow related to it. Normally, this relation is expressed in terms of the positions of the characters of the substring in the completion string. Here's a visual example where a using is typing
qload
to get to multiple completions that contain the characters of that substring in different positions.For this to work in LSP, where the substring that the user wants to complete is not at all a part of the request (it could be, but that's a whole different discussion), the server has to provide the regions of each completion that it considers are the parts of that that are already set down. So if the user requests completion somewhere around the word
afoo
in the document and the server decides to provides completionsfarfromsober
andgalaxyfooraway
, it could/should also provide the reasoning behind this choice, so that the user can be made visually aware of it. In this case the numbers(1 3 5 8)
for the first completion and(1 6 7 8)
for the second completion would probably suffice: they are the 0-based indexes of the characters ofafoo
in the completion. Or maybe it can say(1 3 5 8)
for the first and(1 (6 8))
for the second, to save some bandwidth. Note that all of this is indeed consistent with the user wanting to completeafoo
, but it's up to the server to decide whicha
ingalaxyfooraway
it means.It's also desirable that the server criteriously order the completions in the results, perhaps sorting to the top the completions where the match is more closely grouped. Fortunately LSP already already has support for that (though nothing like a "score" indication).
Well, long story short, we need a new field for the
Completion
data type. I propose amatchIndexes
field. Maybe to JSONize the examlples it's easier[1 3 5 8]
for the first example and[1 {:start 6 :end 8}]
for the second example.This isn't coming out of thin air, BTW
The text was updated successfully, but these errors were encountered: