WIP: Add "whole word" option to search/query config #14
It would be useful to have a "whole word" option for searches. This is particularly useful when replacing, to ensure that parts of words aren't unintentionally replaced.
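To illustrate the replacement problem described above, here is a minimal sketch. The helper names (`escapeRegExp`, `replaceAll`) are hypothetical and not part of `@codemirror/search`, and the regular expression `\b` is used as a stand-in for a word-boundary check, which only recognises ASCII-style word characters.

```typescript
// Hypothetical helpers for illustration only; not the @codemirror/search API.

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every occurrence of `needle`, optionally only where it forms a whole word.
// Assumption: `\b` is an acceptable approximation of "word boundary" here.
function replaceAll(text: string, needle: string, replacement: string, wholeWord: boolean): string {
  const body = escapeRegExp(needle);
  const pattern = wholeWord ? `\\b${body}\\b` : body;
  return text.replace(new RegExp(pattern, "g"), replacement);
}

const input = "cat concatenate cat.";
const plain = replaceAll(input, "cat", "dog", false); // also rewrites the middle of "concatenate"
const whole = replaceAll(input, "cat", "dog", true);  // leaves "concatenate" untouched

console.log(plain);
console.log(whole);
```

Without the whole-word restriction, the "cat" inside "concatenate" is replaced as well, which is the unintended behaviour the option is meant to prevent.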
- Move insideWord and insideWordBoundaries to a separate module, wrapped in isWholeWord.
- Add a wholeWord boolean option to the search extension and the SearchQuery class.
- Add an isValidMatch function, used to filter matches (currently only using the wholeWord option, but could potentially also be used for a "search within selection" option).
- Pass state around instead of doc, so the state can be used inside the isValidMatch function.
This approach seems to work, but it does require creating state.charCategorizer for every match. It's an opt-in behaviour, so maybe that's acceptable, but I'm completely open to other ways of doing this.
I've gone with an alternative implementation in 790059135d. Does that work for you?
That seems to work very nicely 👍
I think we might want to change the definition of "whole word" to be "anything that doesn't have a word character directly outside it, on either side", but that will need a bit of verification, so it's up to you if you think that logic makes sense and would prefer to implement it now.
On closer inspection/testing, there seems to be a bug in the stringWordTest and regexpWordTest implementations:
github.com/codemirror/search@e2d6ffbf1f/src/search.ts (L168-L179)
github.com/codemirror/search@e2d6ffbf1f/src/search.ts (L246-L253)
They should be saying "return true if the characters directly inside are word characters, and the characters directly outside are non-word characters", i.e. this:
Alternatively, for the more straightforward "anything that doesn't have a word character directly outside it, on either side" definition of a whole word, we could use this:
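The two definitions quoted above can be sketched roughly as follows. This is an illustrative reconstruction under assumptions, not the code posted in the thread: `isWordChar` is a simplified stand-in for a character categorizer, and `strictWholeWord` and `lenientWholeWord` are hypothetical names operating on a plain string rather than on an editor document.

```typescript
// Simplified stand-in for a character categorizer (assumption: ASCII-style word characters).
// An empty string, as returned by charAt() for out-of-range positions, counts as non-word.
const isWordChar = (ch: string): boolean => /^\w$/.test(ch);

// The four characters around the boundaries of a match covering [from, to) in `text`.
interface Edges {
  beforeStart: string; // directly outside the start
  afterStart: string;  // directly inside the start
  beforeEnd: string;   // directly inside the end
  afterEnd: string;    // directly outside the end
}

function edges(text: string, from: number, to: number): Edges {
  return {
    beforeStart: text.charAt(from - 1),
    afterStart: text.charAt(from),
    beforeEnd: text.charAt(to - 1),
    afterEnd: text.charAt(to),
  };
}

// First definition: word characters directly inside, non-word characters directly outside.
function strictWholeWord(text: string, from: number, to: number): boolean {
  const e = edges(text, from, to);
  return isWordChar(e.afterStart) && isWordChar(e.beforeEnd) &&
    !isWordChar(e.beforeStart) && !isWordChar(e.afterEnd);
}

// Second definition: no word character directly outside the match, on either side.
function lenientWholeWord(text: string, from: number, to: number): boolean {
  const e = edges(text, from, to);
  return !isWordChar(e.beforeStart) && !isWordChar(e.afterEnd);
}

const sample = "foo (bar) baz";
console.log(strictWholeWord(sample, 4, 9), lenientWholeWord(sample, 4, 9)); // match "(bar)"
```

The two definitions differ only for matches that begin or end with a non-word character, such as "(bar)" in the sample: the first rejects it, the second accepts it.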
That depends on how you specify this feature. I followed VS Code, which seems to disable the test for matches that don't themselves end/start in word characters.
Matching the behaviour of VS Code sounds good in theory (it's what I was aiming for as well), but there seems to be something not quite right in the logic, as it currently matches anything which starts or ends with a non-word character, even if the other end is in the middle of a word.
See demo in https://nz2134.csb.app/ or try once the latest version of @codemirror/search has been deployed there.
That is what I was going for. Are you proposing to only disable the check for sides of the match that don't start/end with a word character?
I think the behaviour of VS Code is almost what's in https://github.com/codemirror/search/pull/14#issuecomment-1234183763, but it matches anything that doesn't have a word character immediately inside and outside an edge of the match.
How does the attached patch look?
After a bit more investigation, I think the simplest description of "whole word" might be "does not have a word character on both sides of a boundary"?
I'm interpreting that as "has a non-word character on at least one side of both boundaries", which I think is correct.
Yes, that's what the new grouping of the expression was supposed to make clear -- it only filters them out if either side has both a word character before and a word character after it.
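The definition the thread converges on ("does not have a word character on both sides of a boundary") can be sketched as below. This is a minimal illustration under assumptions, not the code released in 6.2.1: `isWord` is a simplified ASCII-style categorizer and `matchIsWholeWord` is a hypothetical name working on a plain string.

```typescript
// Simplified stand-in for a character categorizer (assumption: ASCII-style word characters).
// charAt() returns "" outside the string, which is treated as a non-word character.
const isWord = (ch: string): boolean => /^\w$/.test(ch);

// A match covering [from, to) is rejected only if one of its boundaries sits inside a word,
// i.e. there is a word character directly before AND directly after that boundary.
function matchIsWholeWord(text: string, from: number, to: number): boolean {
  const startInsideWord = isWord(text.charAt(from - 1)) && isWord(text.charAt(from));
  const endInsideWord = isWord(text.charAt(to - 1)) && isWord(text.charAt(to));
  return !(startInsideWord || endInsideWord);
}

const line = "foo-bar baz";
console.log(matchIsWholeWord(line, 0, 3)); // "foo": both boundaries touch a non-word side
console.log(matchIsWholeWord(line, 3, 7)); // "-bar": starts with a non-word character, ends at a word end
console.log(matchIsWholeWord(line, 3, 6)); // "-ba": the end boundary falls inside "bar"
console.log(matchIsWholeWord(line, 1, 3)); // "oo": the start boundary falls inside "foo"
```

Under this rule a match such as "-ba" is filtered out even though it starts with a non-word character, because its other end lies in the middle of a word, which is the case raised earlier in the discussion.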
@marijnh This is looking good now, if you wouldn't mind putting out a new release 🙏🏼
I've tagged 6.2.1
Pull request closed