Search++: A work in progress
-
Hello, @coises and All,
- Regarding the Bookmarks :

I’m pretty dumb for not thinking of the N++ native command `Search > Bookmark > Clear All Bookmarks` or, even better : a right mouse click within the `Bookmark margin` with the same option !
- Regarding the selection concept :

Many thanks for your explanations. So, if I understand you correctly, we need to transform the selection(s) into `Marked Text` first, and then use the `Find in Mark Text` option
- In your initial post, near the end of the `Features` section, you said :

> Regular expression searches in Search++ perform a fully Unicode-based search using a customized combination of Boost.Regex and ICU4C. In particular, this produces fewer “surprising” results with Unicode characters above 0xFFFF (including most emoji) and when searching in documents using a DBCS code page (which in Notepad++ can be Chinese, Japanese or Korean files that are in the system default encoding instead of in Unicode).

Then, at the end of the `Quirks and features ...` section :

> The ICU button at the top is there mostly for testing. It uses the regular expression engine built into ICU, which has different syntax than the familiar Boost.Regex engine and does not integrate as well with Scintilla. Replace is not implemented for this search engine, and it only works on Unicode documents. It will probably be removed when Search++ reaches version 1.0, as it really isn’t very useful except as a check on the results from the main Regex engine (since I’ve meddled with the main Regex engine quite a lot, and I haven’t modified the ICU engine in any way).

And later, at the end of the `Missing and Planned Features ...` section :

> I hope to add more features to the regular expression search. The current version is almost identical to the search in Columns++, but presented in what is hopefully a more flexible and user-friendly interface. It should be more accurate for Unicode-derived properties since it uses ICU4C directly instead of working from the home-grown parse of Unicode tables used in Columns++. If I can work out a way, I hope to add Unicode word breaks and more Unicode properties.
So, some questions :
- When clicking on the `Regex` button, do we use your Unicode search engine, as in `Columns++`, or is it a mix of the `Columns++` version and `ICU` ?
- Oddly, if we choose the `ICU` button, the `Replace` and `Replace All` buttons are not greyed out and seem functional, contrary to what you said ?!
- Can you recommend a few websites that discuss `ICU` and the specifics of `Unicode Word Boundaries` ?
- Presently, when hitting the `ICU` button, are searches like `\p{alphabetic}` or `\p{XID_Continue}` possible against my `Total_Chars` file of `325,590` characters ?
TIA for all your answers !
Best Regards,
guy038
-
@Vitalii-Dovgan said in Search++: A work in progress:
> Maybe you’ll also consider an alternate UI (in addition to the main one) in a form of a small one-line search panel
Probably not one line, but reasonably compact should be possible. At present you can dock the docking Search++ dialog to the top or bottom instead of the left or right, if you like that better. The layout adapts, but it doesn’t use the full width as well as it could — right now there are only horizontal and vertical layouts, and I need to work out an “ultra-wide” layout that puts all the buttons and check boxes into a single row when the dialog is wide enough. I don’t see any reason that can’t be done, though.
> The painful points of the Incremental Search panel are:
> - it does not clear the last word or the entire Find What field by Ctrl+Backspace. Instead, a stupid unreadable symbol is inserted. (Yes, I know, Microsoft forces us to write own handler of Ctrl+Backspace in each and every instance of an Edit control, what a shame),
Search++ does that now. (That must be the default command for Ctrl+Backspace in Scintilla, since I did nothing special to make it work. I’ve never used Ctrl+Backspace.)
> - it does not forward Ctrl+Tab to the main window, thus not allowing to switch between tabs while in the Incremental Search panel.
I see regular Notepad++ search doesn’t do that either. (It uses Ctrl+Tab to switch dialog tabs, though, so that makes sense.) Search++ doesn’t do it now; I don’t know if it’s possible (particularly from a docked dialog) but I will see if it can be done.
At present you can switch rapidly to the main Notepad++ window with Ctrl+N. If you’ve set a shortcut for Search++ you can then use that to switch back again. I know that’s still extra keystrokes, so I will see if Ctrl+Tab can be forwarded, since it’s not used for anything in Search++.
Thank you for your observations and suggestions!
-
@Coises
It should be possible to forward Ctrl+Tab and Ctrl+Shift+Tab by processing WM_KEYDOWN with VK_TAB in your dialog’s DlgProc similarly to this:
https://github.com/d0vgan/AkelPad-Plugs-QSearch/blob/master/Source/QSearch/QSearchDlg.c#L4569

Interestingly, the Right Ctrl key often emulates Ctrl+Alt, so when you verify only the presence of VK_TAB and VK_CONTROL (like in the code mentioned above), this code also works for RightCtrl+Tab which becomes VK_TAB and VK_CONTROL and VK_MENU. (VK_MENU is the Alt key. Unlike the real Alt key that comes under WM_SYSKEYDOWN, the “Ctrl+Alt” from RightCtrl comes under WM_KEYDOWN).

Oh, WM_KEYUP should be handled as well:
https://github.com/d0vgan/AkelPad-Plugs-QSearch/blob/master/Source/QSearch/QSearchDlg.c#L4607
-
@guy038 said in Search++: A work in progress:
> So, if I understand you clearly, we need to transform the selection(s) in `Marked Text`, first and then use the `Find in Mark Text` option

Yes; or click the Tools button, open Settings and check Convert selections to marked text before beginning a stepwise search to have Search++ do it automatically. Otherwise, multiple searches that don’t affect the selection (like Count or Find All or Replace All) will work within the selection, but only the first stepwise Find (or the preliminary find in a stepwise Replace) will be constrained to the selection, since after that the original selection will be gone.
> - When clicking on the `Regex` button, do we use your Unicode search engine, as in `Columns++` or is it a mix of the `Columns++` version and `ICU`

It’s the Columns++ search engine, except for one thing. Previously I could not figure out how to incorporate ICU4C into the plugin, so for Columns++ I devised a Python program that reads several of the Unicode character data files and writes C++ code that compiles into a gigantic table containing the information I needed. I stumbled on the way to use ICU4C shortly before I began working on Search++; instead of building and using those tables, I go straight to ICU4C for information (questions like, “What is the general category of this character?” or “Is this a lower case character?”).
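As an illustration of why those two questions are separate, here is a minimal sketch. It is written in Python and relies on Python's bundled Unicode tables (`unicodedata` and `str.islower()`), not on ICU4C or on the plugin's own code, so it only demonstrates the distinction, not how Search++ obtains the answers. The helper name `describe` is made up for this example.

```python
import unicodedata


def describe(ch: str) -> tuple[str, bool]:
    """Return (general category, lowercase?) for a single character.

    Assumption: Python's str.islower() is used here as a stand-in for the
    Unicode "Lowercase" property; ICU4C exposes its own API for this.
    """
    return unicodedata.category(ch), ch.islower()


# U+0061 is a lower case letter; U+2170 is lower case but is a number (Nl).
for ch in ("a", "\u2170"):
    category, lower = describe(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"category={category}, lowercase={lower}")
```

A character class built only from the general category `Ll` would include the first character and skip the second, even though both are lower case.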
It might turn out that this will have an efficiency impact (better or worse? I don’t know). It should fix some of the errors in Columns++, like `[[:lower:]]` missing characters that are lower case but not letters.

> - Oddly, if we choose the `ICU` button, the `Replace` and `Replace All` buttons are not greyed and seem functional, contrary to what you said ?!
They’re not disabled, but all they do is return the message, “Command not implemented.”
> - Can you recommend a few websites, speaking about `ICU` and the `Unicode Word Boundaries` specificity ?

I don’t really have anything except the Unicode documentation. In my brief testing, the practical effect in English is that words like `can't` are recognized as a single word. Most regular expression engines define a word boundary (`\b`) in terms of what is a word character (`\w`). The regular expression engine in ICU lets you do that, but it also provides an option to use Unicode word boundaries to define `\b`.

> - Presently, when hitting the `ICU` button, do searches like `\p{alphabetic}` or `\p{XID_Continue}` are possible against my `Total_Chars` file of `325,590` characters ?

Yes. You can even use things like `\p{script=Greek}`. Unfortunately, I haven’t been able to find any place where ICU documents its own regular expression syntax. The regular-expressions.info web site includes ICU among the regex dialects it shows.
-
@guy038 said in Search++: A work in progress:
> I’m a bit annoyed to not be able to clear this panel at any time and that I need to close and re-open a N++ session to that purpose ! Personally, an option in the Tools menu, to clear the Search++ Results panel would be great !

> Regarding the search direction : I do appreciate to temporarily reverse the search direction, with native N++ search, by hitting or releasing the Shift key ! Would it be possible to add this functionality to Search++ plugin ?
These features, and some bug fixes, are in version 0.3.
-
Hi, @coises and All,
Thanks for your new `Search++_03` release !

BTW, with native N++ search, the `Shift + Enter` shortcut is also available when you choose the `Regular expression` search mode ( provided that the `regexBackward4PowerUser="yes"` option is present within the `config.xml` file ). Maybe you could allow it as well in `Search++` ?
I just discovered `ICU`’s features, and they’re really impressive ! Over the next few days, I’ll try to list the many Unicode properties accessible through `ICU`… Another whole new world is opening up to me !! Personally, I think the `ICU` button should remain available in future versions !
I ran into a problem while selecting characters. For example :
- Put this small text in new tab
ໜໝໞໟༀ༁༂༃༄༅༆༇༈༉༊་༌།༎༏༐༑༒༓༔༕༖༗༘༙༚༛༜༝༞༟༠༡༢༣༤༥༦༧༨༩༪༫༬༭༮༯༰༱༲༳༴༵༶༷༸༹༺༻༼༽༾༿ཀཁགགྷངཅཆཇཉཊཋཌཌྷཎཏཐདདྷནཔཕབབྷམཙཚཛཛྷཝཞཟའཡརལཤཥསཧཨཀྵཪཫཬཱཱཱིིུུྲྀཷླྀཹེཻོཽཾཿ྄ཱྀྀྂྃ྅྆྇ྈྉྊྋྌྍྎྏྐྑྒྒྷྔྕྖྗྙྚྛྜྜྷྞྟྠྡྡྷྣྤྥྦྦྷྨྩྪྫྫྷྭྮྯྰྱྲླྴྵྶྷྸྐྵྺྻྼ྾྿࿀࿁࿂࿃࿄࿅࿆࿇࿈࿉࿊࿋࿌࿎࿏࿐࿑࿒࿓࿔࿕࿖࿗࿘࿙࿚ကခဂဃငစဆဇဈဉညဋဌ-
- Switch to this new tab
- Run `Plugins > Search++ > Search...`
- Select the `ICU` button
- SEARCH `\p{script=Tibetan}`
- Check the `Match case` option
- Right click on the `Find All` button
- Choose the `Select > Select in Whole Document` option
=> A selection appears with the bottom message `Selected 207 matches`

- Without doing anything else, I use the `Ctrl + C` shortcut

After opening another new tab, I was quite surprised that the `207` Tibetan chars were not pasted after a `Ctrl + V` operation ?!

Then, I understood that the selection is effective ONLY IF :

- The `Search++` plugin is closed with the `x` button or using the `ESC` key
- You click again on the `New 1` tab, with `Search++` not in focus
- You move the `New 1` text one line `Up` or `Down` with the `▲` or `▼` marks of the vertical scroll bar
@coises, is this behaviour correct ?
Regarding the `Unicode Word boundaries` :

I had a look at https://www.regular-expressions.info/unicodeboundaries.html#word

I understood that :

- When `ICU` is selected and `Unicode word boundaries` is not checked, the `\b` regex, against our Tibetan text above, counts `46` matches
- When `ICU` is selected and `Unicode word boundaries` is checked, the `\b` regex, against our Tibetan text above, counts `176` matches

Quite different, indeed ! Note that if `Unicode word boundaries` is not checked, the `(?w)\b` regex would also return `176` matches. Thus, a leading `(?w)` forces the use of the `Unicode word boundaries` option !

Then, reading https://www.regular-expressions.info/unicodeboundaries.html#grapheme, I realized that, presently, the `\b` regex cannot identify the different grapheme positions !

Would it be possible to add an option for this specific case, or am I asking too much ? I suppose the latter is true !!
Best Regards,
guy038
-
@guy038 said in Search++: A work in progress:
> Thanks for your new `Search++_03` release !

Thank you for testing it.

> BTW, with native N++ search, the `Shift + Enter` shortcut is also available when you choose the `Regular expression` search mode ( with the condition that the `regexBackward4PowerUser="yes"` option is present within the `config.xml` file. May be, you could allow it as well in `Search++` ?

Regex backward… I have my doubts, but I can leave it open as something I might try to make available some day. When I’ve thought about it before, I get caught up trying to define exactly what it means to match regular expressions backward. Regular expressions can match different lengths depending on where they start. Is the previous match the one that ends at the latest possible position? The one that begins at the latest possible position? The last one that would have occurred before the current position if you matched forward repeatedly from the beginning of the text? The one that would result from reversing both the text and the regular expression (but then what do you do with backreferences)?
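To make the ambiguity concrete, here is a small sketch comparing two of those candidate definitions. It uses Python's `re` module purely for illustration (Search++ itself is built on Boost.Regex), and the function names are invented for this example.

```python
import re


def last_forward_match(pattern: str, text: str, pos: int):
    """Last match produced by repeated forward matching from the start
    of the text that ends at or before pos."""
    last = None
    for m in re.finditer(pattern, text):
        if m.end() > pos:
            break
        last = m
    return last.span() if last else None


def latest_start_match(pattern: str, text: str, pos: int):
    """Match that begins at the latest possible position, trying each
    start position from pos backward; the text is cut off at pos."""
    rx = re.compile(pattern)
    for start in range(pos, -1, -1):
        m = rx.match(text, start, pos)
        if m:
            return m.span()
    return None


text = "abcabc"
print(last_forward_match(r"a?bc", text, len(text)))  # span of "abc"
print(latest_start_match(r"a?bc", text, len(text)))  # span of "bc"
```

With the pattern `a?bc` and the caret at the end of `abcabc`, the first definition yields the final `abc` while the second yields only `bc`, so the two notions of "previous match" already disagree on a very simple expression.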
Shift+Enter is a different problem. Enter doesn’t work to find: since the Find and Replace boxes take multiple lines, they consume the Enter key. You can use Alt+F and Alt+R (the underlined characters on the Find and Replace buttons), but those combinations are a bit awkward. I’ve been thinking of just making Shift+Enter and Ctrl+Enter do the functions on the Find and Replace buttons — I think those would be more natural than Alt+F and Alt+R for most people (including me). But then it isn’t obvious how access to backward should work. Beyond all that, there is no standard Windows mechanism for keyboard-only access to the drop-down menus on split command buttons. Once you can get to the button without clicking it, down arrow works to open the menu; but you can’t get there with Alt+underlined letter: that does the click action. I haven’t figured out a good way to deal with all of the keyboard navigation obstacles yet.
Which is a long way of saying I don’t know which of too many possibilities I will eventually decide must take priority for keyboard actions, so I don’t know what I can/will do in that regard.
> Personally, I think the `ICU` button should remain available in future versions !

I’ll probably leave the function there… it might be “hidden” (like a Shift-click on Regex) so it doesn’t confuse people who would probably never use it.
> - Choose the `Select > Select in Whole Document` option
>
> => A selection appears with the bottom message `Selected 207 matches`
>
> - Without doing anything else, I use the `Ctrl + C` shortcut
>
> After opening an other new tab, I was quite surprised that the `207` tibetan chars were not pasted, after a `Ctrl + V` operation ?!
>
> Then, I understood that the selection is effective ONLY IF :
It’s not that selection isn’t effective, it’s that keyboard focus was still in the Search++ dialog. You have to move focus to the document for the Ctrl+C to work.
You can use Ctrl+N (think “Notepad++”) to return focus to the document, or (as you discovered) click on the tab if you’re using the mouse.
This does make me think I should probably have an option, perhaps enabled by default, to return focus to the document automatically after a select operation, since wanting to copy is probably the most common reason for using select.
(I’ve been bitten by this often enough in Columns++, which works the same way. It’s just so easy to forget that focus is in the dialog, not the document.)
> Note that if the `Unicode word boundaries` is not checked , the `(?w)\b` regex would also return `176` matches. Thus, a leading `(?w)` forces the use of the `Unicode word boundaries` option !

Hmmm… I’m not sure what’s happening there.

> Then, reading https://www.regular-expressions.info/unicodeboundaries.html#grapheme, I realized that, presently, the `\b` regex cannot identify the different grapheme positions !
>
> Would it be possible to add an option for this specific case
In both Regex and ICU, \X matches a single grapheme cluster. In Regex, (?=\X) matches a grapheme boundary; that doesn’t work in ICU. (It looks like in ICU, \X actually matches from the current position to the end of a grapheme cluster. In Regex, the match must begin and end on a grapheme cluster boundary. The Boost.Regex logic already worked that way, but I replaced/extended it to use the grapheme break algorithm specified by Unicode.) \X partially works in built-in Notepad++ search, too, but it misses some cases and falls apart entirely outside the BMP.
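One plausible reason searches become fragile outside the BMP, offered here as an assumption rather than a description of Notepad++ internals, is that any engine working on 16-bit units sees such a character as two units instead of one. A short Python sketch of the difference (the helper name is invented for this example):

```python
def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to store s in UTF-16."""
    return len(s.encode("utf-16-le")) // 2


# len() counts code points in Python; utf16_units() counts 16-bit units.
for label, s in (("BMP letter", "\u00e9"), ("emoji", "\U0001F600")):
    print(f"{label}: {len(s)} code point(s), {utf16_units(s)} UTF-16 unit(s)")
```

A regex construct that advances by one unit at a time can therefore stop in the middle of a surrogate pair unless the engine is written to step by whole code points.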
-
Hi, @Coises and All,
Don’t worry about my request for searching for regular expressions in the opposite direction: I can live without that feature !
Regarding the possibility of changing the `Alt + F` and `Alt + R` shortcuts to some others, I’m not really in favor of it because you would break a very old `Windows` standard !

BTW, when in Plain text and with focus on `Search++`, let’s suppose we search for the string `and`

- Using the `Alt + F` shortcut does move to the next match and, in addition, the shortcut `Alt + Shift + F` does move to the previous match. Nice, indeed !
- But, if I decide, now, to use the `Enter` key, it wrongly adds a line-break after the word `and`, in the Find dialog of `Search++`

So, to my mind, it would be better to put the focus on the Find button as soon as you hit the `Alt + F` shortcut, in order to use either the `Alt + F` OR `Enter` shortcuts, and the `Alt + Shift + F` OR `Shift + Enter` shortcuts !
Regarding my selection problem :

Oh… yes, indeed ! The solution was obvious : `Ctrl + N` to switch focus from the `Search++` plugin to native `N++` !

You said :

> This does make me think I should probably have an option, perhaps enabled by default, to return focus to the document automatically after a select operation, since wanting to copy is probably the most common reason for using select.

Yes indeed; this would make sense !

Note that I personally choose the `Ctrl + Shift + N` shortcut for the command `Plugins > Search++ > Search...`

Thus, as a summary :
- When focus is on `Notepad++`, the `Ctrl + Shift + N` shortcut opens or puts focus on the `Search++` plugin ( User shortcut )
- When focus is on `Search++`, the `Ctrl + N` shortcut puts focus on `Notepad++` ( Search++ shortcut )
- When focus is on `Search++`, the `Ctrl + O` shortcut toggles between the Find and the Replace dialog of `Search++` ( Search++ shortcut )
- When focus is on `Search++`, the `Ctrl + H` shortcut re-opens or puts focus on the `Search++ Results` ( Search++ shortcut )
- When focus is on `Search++ Results`, the `Ctrl + O` shortcut puts focus on the `Search++` plugin ( Search++ shortcut )
- When focus is on `Search++ Results`, the `Ctrl + Shift + N` shortcut closes the `Search++ Results` panel and puts focus on `Notepad++` ( Search++ shortcut )
- When focus is on `Search++`, the `Ctrl + Shift + N` shortcut closes the `Search++` panel and puts focus on `Notepad++` ( User shortcut )
BTW, @coises, why didn’t you choose a single shortcut ( I’m thinking about `Ctrl + H` ) to toggle between the `Search++` plugin and the `Search++ Results` ? Native Notepad++ just uses the `F7` key to shift the focus, back and forth, between the `Document` window and the `Search results` panel !
Regarding the `Unicode Word Boundaries` :

When I said :

> Note that if the `Unicode word boundaries` is not checked , the `(?w)\b` regex would also return `176` matches. Thus, a leading `(?w)` forces the use of the `Unicode word boundaries` option !

There’s nothing weird about this assertion. It just means that the `(?w)` and `(?-w)` modifiers act in the same way as the well-known `(?s)` and `(?-s)` modifiers, which set / unset the `. matches new-line` option, whether or not this option is physically checked, in native `N++` search !
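The same principle, an inline modifier taking precedence over an option set outside the pattern, can be sketched with Python's `re` module. This is only an analogy: Python has no `(?w)` modifier and does not accept a bare `(?-s)`, so the "unset" case below uses the scoped form `(?-s:...)`.

```python
import re

text = "a\nb"

# No flag, no modifier: "." does not match the line break.
plain = re.search(r"a.b", text)

# Inline (?s) turns "dot matches newline" on, although no flag is passed.
inline_on = re.search(r"(?s)a.b", text)

# The flag is passed, but the scoped modifier (?-s:...) turns it off again.
scoped_off = re.search(r"(?-s:a.b)", text, re.DOTALL)

print(plain, inline_on, scoped_off)
```

Only the second search finds a match: the modifier written in the expression decides the behavior in both directions, regardless of the externally supplied option.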
- Regarding the `Grapheme Boundaries` :

No need to add any option ! I’ve just realized that, in `regex` mode, the simple regex `(?!\X).` does match any character which is not a `Grapheme-Base` char. Thus, a `Count` action would detect, for instance, the total number of accent characters attached to simple Latin letters !
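A rough way to cross-check such a count outside the editor is sketched below in Python. It is only an approximation and an assumption on my part: it counts code points whose general category is a mark (`Mn`, `Mc`, `Me`), whereas the full grapheme cluster rules of Unicode cover more cases (Hangul syllables, emoji joiner sequences, CR LF), so its result can differ from what `(?!\X).` reports. The function name is invented for this example.

```python
import unicodedata


def count_combining_marks(text: str) -> int:
    """Count code points whose general category starts with "M"."""
    return sum(1 for ch in text if unicodedata.category(ch).startswith("M"))


# "e" + acute accent, "a" + grave accent + circumflex, then a bare "z"
sample = "e\u0301a\u0300\u0302z"
print(count_combining_marks(sample))
```

For text made of Latin letters with combining accents, this gives the number of accents attached to base letters, which is the quantity the `Count` action is being used for here.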
I still need to explore all of `Search++`'s features and, most importantly, to compile a list of the various properties available with the `ICU` regular expression engine.

BR

guy038
-
Hello, @coises and All,
Two more points :
- Open the `change.log` file

Let’s suppose that the N++ Find dialog is already opened and that the Find field contains the text `This is a test`

- Now, switch back to the `change.log` file
- Select the `Updater (Installer only):` text
- Use the `Ctrl + F` shortcut

=> The previous text is updated to the new text to search : `Updater (Installer only):` => OK

- Now, open `Search++` ( with `Plugins > Search++ > Search...` or with my shortcut `Ctrl + Shift + N` )
- Type in `This is a test` in the Find dialog of `Search++`
- Click on the `change.log` tab
- Select again the `Updater (Installer only):` text
- Put the focus again on the `Search++` plugin ( with `Plugins > Search++ > Search...` or with my shortcut `Ctrl + Shift + N` )

=> The text is not updated and remains the string `This is a test` ! To get it updated, you need to close and re-open `Search++`

Could you provide this N++ search behavior to `Search++`, as well ?
- When the `Search++` dialog is docked, it’s very easy to identify if focus is on `Notepad++` or on `Search++`, thanks to the blue color of the title bar. However, this difference is not so obvious when the `Search++` plugin is not docked ! Is there a way to make this difference more visible ?

Best Regards,
guy038
-
@guy038 said in Search++: A work in progress:
> Regarding the possibility of changing the `Alt + F` and `Alt + R` shortcuts to some anothers, I’m not really in favor of it because you would break a very old `Windows` standard !

True. I would not change them; I was thinking about adding Shift+Enter and Ctrl+Enter as alternatives.
> - Using the `Alt + F` shortcut does move to the next match and, in addition, the shortcut `Alt + Shift + F` does move to the previous match. Nice, indeed !
That was a “bonus.” It hadn’t occurred to me that Shift would work that way, though now that you mention it, I can see why it does.
> - But, if I decide, now, to use the `Enter` key, it wrongly add a line-break after the word `and`, in the Find dialog of `Search++`
>
> So, to my mind, it would be better to put the focus on the Find button as soon as you hit the `Alt + F` shortcut, in order to use, either, the `Alt + F` OR `Enter` and the `Alt + Shift + F` OR `Shift + Enter` shortcuts !

The Windows dialog manager doesn’t normally move focus for an Alt+ shortcut to a command button, it just does the command and leaves the focus unchanged. I could probably find a way to override that, but it’s not clear to me that I should. If you started using Alt+F, why not continue that way?
> > This does make me think I should probably have an option, perhaps enabled by default, to return focus to the document automatically after a select operation, since wanting to copy is probably the most common reason for using select.
>
> Yes indeed; this would make sense !
It will be in the next release.
> BTW, @coises, why didn’t you choose a single shortcut ( I’m thinking about `Ctrl + H` ) to toggle between the `Search++` plugin and the `Search++ Results` ? Native notepad++ just use the `F7` key to shift the focus, back and forth, between the `Document` window and the `Search results` panel !

I suppose I was thinking that since I can’t really make one command that handles all the focus changes (because I don’t know what the user will assign for Search++/Search in the Notepad++ shortcut mapper, if anything at all), I would have Ctrl+N always go to the document, Ctrl+O always go to the Search dialog (though it’s Find Box unless you’re already in the Find Box, in which case it’s the Replace Box…) and Ctrl+H always go to the results (“hit list”).
It would make at least as much sense to have Ctrl+H toggle between the search dialog and the results list, though. Since you mention it, I’ll probably make that change.
> There’s nothing weird about this assertion. It just means that the behavior of the `(?w)` and `(?-w)` modifiers act in the same way

Of course, you are correct.
-
@guy038 said in Search++: A work in progress:
> - Put the focus again on the `Search++` plugin ( with `Plugins > Search++ > Search...` or with my shortcut `Ctrl + Shift + N` )
>
> => The text is not updated and remains the string `This is a test` ! To get it updated, you need to close and re-open `Search++`
>
> Could you provide this N++ search behavior to `Search++`, as well ?

I did not realize the native dialog worked that way; I assumed it only filled when the dialog wasn’t already open.
I don’t see a reason I couldn’t do this, if it’s what people will ordinarily expect. I would have thought being able to change focus back to the dialog by keyboard without disturbing the contents of the Find box would have been more desirable, but maybe not. Users can always disable the fill option, or I could add an additional setting to let the user decide whether filling only occurs when the dialog wasn’t already open.
Ctrl+I will insert the text selected in the document into the Find (or Replace) box at any time. Since it is an insert, you have to do Ctrl+A, Ctrl+I if you want to replace.
> When the `Search++` dialog is docked, it’s very easy to identify if focus is on `Notepad++` or on `Search++`, thanks to the blue color of the title bar. However, this difference is not so obvious when the `Search++` plugin is not docked ! Is there a mean to improve this difference ?

Agreed, the difference is fairly subtle in light mode: the text in the title bar goes grey, and the shadow around the dialog gets a bit less. It’s a bit more visible in dark mode, where the whole title bar changes color.
I have to think about whether there is anything I can do that would make this more visually apparent without being garish or peculiar.
(Note: at present, the Search++ dialog does not fully accommodate changing between dark and light mode within a single Notepad++ session. The Scintilla controls in the regular dialog and in the docking dialog stay however they were when you first opened that type of dialog.)