Search++: A work in progress
-
@Vitalii-Dovgan said in Search++: A work in progress:
My next suggestion may ruin one of the initial purposes of Search++, but maybe you’ll find it useful anyway? Here it is: the “icudt78.dll” is a big DLL with a size of 32 MB, and when I think about a portable version of Notepad++, I usually consider a portable and small version of Notepad++. Correspondingly, what do you think about an attempt to load the “icudt78.dll” at runtime and if it is not available then just disable the “ICU” button and allow the rest of the plugin to operate normally?
Regex search also depends on ICU, so I can’t really make it optional.
One of the major “under the hood” changes I made when I built Search++ based on the search in Columns++ was to use ICU for all the Unicode properties instead of the rather hacked-together approach I used in Columns++. The underlying engine in both search in Columns++ and Regex search in Search++ is Boost.Regex, with a considerable amount of customization to make it work well with Scintilla and with Unicode.
I’m really hesitant to reverse that, as the ICU properties are more accurate and easier to update when Unicode releases new versions, while the Columns++ approach is tricky and a bit fragile, and still doesn’t get everything right.
However, as you note, it does increase the size: that one dll is over six times the size of the plugin dll. It also appears that the current version of the ICU4C dll files (at least the pre-compiled ones) don’t work on Windows 7 (and presumably Windows 8).
So… I suppose it’s possible that at some point I’ll try to undo the use of ICU and go back to the cobbled-together approach, but this time get it right. I can’t hold out hope that it will happen soon, nor promise that it will happen at all.
Edit to add:
I included the ICU search mostly for testing. ICU’s native search doesn’t integrate well with Scintilla, and it lacks some of the syntax familiar to Notepad++ users from Boost.Regex. Since I haven’t manipulated its search process at all, results from ICU searches can serve as a check on whether Regex is properly implemented (allowing for expected/intentional differences). I might remove it before Search++ comes out of pre-release status; but @guy038 has expressed hope that I will leave it in, so it will probably remain, perhaps hidden in some way.
-
@Coises said in Search++: A work in progress:
Search++ version 0.5.2 addresses a number of issues:
- Fix display of Settings: Mark style drop-down in dark mode. (@Snabel42: this should fix the problem you described here.)
It does indeed, thank you!
-
Hi, @coises and All,
I’m exploring all the possibilities offered by the ICU button and, really, it opens up a new regex world with a lot of new options, all related to Unicode! In addition, you get some new regex syntaxes which are easier to interpret and allow an almost infinite number of ways to define a Unicode range of characters!
For example, searching in my Total_Chars.txt file with the Match case option checked:
- The search [[\p{letter}][\p{number}]] is the UNION of the [\p{letter}] and [\p{number}] properties. Thus, it returns 145,672 + 1,924 = 147,596 matches. It could simply be replaced by the Boost syntax [\p{letter}]|[\p{number}].
- The search [[:letter:]&&[\p{InArabic}]] is the INTERSECTION of the [[:letter:]] and [\p{InArabic}] properties. Thus, it returns 153 matches, which is the number of letter characters in the main Arabic block (the 145,672 letters AND the 256 characters of the block). It could be replaced by the Boost syntax (?=[[:letter:]])\p{InArabic}.
- The search [[:letter:]--[a-z]] is the ASYMMETRIC DIFFERENCE between the [[:letter:]] and [a-z] properties. It returns 145,672 - 26 = 145,646 characters. It could be replaced by the Boost syntax (?![a-z])[[:letter:]], which also returns 145,646 characters.
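The three set operations above can be mimicked outside Search++ with ordinary Python sets. This is only a sketch under stated assumptions: it uses Python's `unicodedata` module rather than ICU, treats "letter" and "number" as the general categories L* and N*, hard-codes the main Arabic block as U+0600 to U+06FF, and scans every non-surrogate code point as a stand-in for Total_Chars.txt. The counts depend on the Unicode version bundled with the interpreter, so they will not necessarily match the figures quoted above.

```python
import sys
import unicodedata

# Every code point except surrogates (assumed stand-in for Total_Chars.txt).
ALL_CHARS = [chr(cp) for cp in range(sys.maxunicode + 1)
             if not 0xD800 <= cp <= 0xDFFF]

def in_category(prefix):
    """Return the set of characters whose general category starts with prefix."""
    return {c for c in ALL_CHARS if unicodedata.category(c).startswith(prefix)}

letters = in_category("L")                                # like [[:letter:]]
numbers = in_category("N")                                # like [\p{number}]
arabic_block = {chr(cp) for cp in range(0x0600, 0x0700)}  # like \p{InArabic}
ascii_lower = set("abcdefghijklmnopqrstuvwxyz")           # like [a-z]

union = letters | numbers           # [[\p{letter}][\p{number}]]
intersection = letters & arabic_block  # [[:letter:]&&[\p{InArabic}]]
difference = letters - ascii_lower  # [[:letter:]--[a-z]]

print(len(union), len(intersection), len(difference))
```

Because the letter and number categories are disjoint, the union is simply the sum of the two counts, and because a-z are all letters, the difference is the letter count minus 26, which is the same arithmetic as in the list above.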
I’m still looking for all the ICU properties returning a valid value against my Total_Chars.txt file! Please be patient for a few days, as about 600 properties remain untested!
Best Regards,
guy038
BTW, @coises, what do you think of my idea regarding the two options, Replace (then find) and Replace (then wait)?
-
@guy038 said in Search++: A work in progress:
I’m still looking for all the ICU properties returning a valid value against my Total_Chars.txt file! Please be patient for a few days, as about 600 properties remain untested!
I would expect that they will all work. I have avoided tampering with the ICU search, so it can serve as a reference standard for the Regex search as far as Unicode properties.
There are some Boost.Regex constructs that ICU does not support. The ones I know about right now are backtracking control (like (*SKIP)(*FAIL)); \K; and \l and \u as shorthand for [[:lower:]] and [[:upper:]]; there may be others. I don’t expect that I will attempt to modify the ICU syntax in any way.
Offering ICU as a “first class” search (working on documents that aren’t UTF-8, being able to replace as well as find, being reasonably efficient for large documents, etc.) is probably possible, but it will take a lot of work. I’m not at all certain it is a task I will attempt.
On the other hand, once the framework is in place and I feel comfortable taking Search++ out of pre-release status, I do hope to add more of the ICU properties to regular Regex.
BTW, @coises, what do you think of my idea regarding the two options, Replace (then find) and Replace (then wait)?
I see your point that the Replace drop-down has become overwhelming with all the options, especially in Plain search.
I’m now planning to add a toggle item (with a check mark) to the Tools menu; something like Jump to next match after Replace. I’m also planning to add keyboard shortcuts for (most or all of) the Tools menu items. That would make it reasonably easy to toggle between Replace (then find) and Replace (then wait) without complicating the menu so badly. (With the shortcut it would be more keyboard-friendly than the present arrangement.) The next version will include something like that, unless I encounter an unexpected obstacle or think of a better idea.
-
Search++ version 0.5.3 is available:
- Avoid a crash when searching in Marked Text in Open Documents or in Marked Text in Documents in this View and a document has no marked text.
- Fix button menus overlapping the button when the button is near the bottom of the screen. This could cause inadvertent activation of the last menu option.
- Make the Scintilla controls (Find box, Replace box and Results list) change colors when dark mode changes. Fix some visibility problems for caret and found text indicators in dark mode.
- Make Show All on the Tools menu scroll current position or selection into view.
- Add toggles Bookmark lines when marking text and Jump to next match after Replace to Tools menu.
- Add keyboard shortcuts for most Tools menu items.
- Condense the Replace button menu and coordinate with the new Jump to next match after Replace toggle.
Note: The keyboard shortcut assignments on the Tools menu are new and might change in the next version in response to my own and others’ experiences, and/or to accommodate any additional menu entries.
@Lachlanmax: Does the implementation of bookmarks in this version work for you? What changes, if any, would you suggest?
@guy038: Do you think the way I’ve done Replace (with the Jump to next match after Replace toggle and the submenu for opposite jump behavior) works sensibly?
All comments, critiques, suggestions and experiences are most welcome!
-
Hello, @coises and All,
Sorry to begin with some bugs :-((
- The Bookmarks feature does not seem to work at all, even if I close and re-open N++ with the Bookmark lines when marking text option already checked in the Search++ > Tools menu!
- When focus is on Search++, the two shortcuts Ctrl + Shift + Y and Ctrl + Shift + E (each of which opens a new dialog) wrongly add the control character EM or ENQ to the text currently typed in the Find box. Note that in Settings > Preferences... > Editing 2 I personally did not check the Prevent control character (C0 only) typing into document option, in order to be able to include some of them in my texts or posts!
  - Note also that this bug happens ONLY IF you use the shortcut. If you instead open the Tools menu and choose the Copy Marked Text... or Settings... option directly, nothing is added to the Find box!
Now, regarding the two ways to do a step-wise replacement :
- I do like your new option in the Tools menu, which TOGGLES the replacement type on the fly, thanks to the Ctrl + J shortcut (I noted the J for Jump).
- But then, I suppose that the last line of the Replace dialog (Do not jump to next match or Jump to next match) and its sub-menu (to adopt the opposite behavior) are rather redundant and useless? What is your feeling about it?
And, personally, I think that a new setting Force the 'Do not jump to next match' behavior during the X first replacements, with 0 < X < 9, would be interesting!
Indeed:
- When that new option is checked:
  - If the current behavior is Jump to next match, it would force Do not jump to next match for the first X replacements; all subsequent replacements would then follow the current behavior.
  - If the current behavior is Do not jump to next match, it would not change anything.
- When that new option is not checked:
  - If the current behavior is Jump to next match, it would follow this behavior.
  - If the current behavior is Do not jump to next match, it would follow this behavior.
Best Regards,
guy038
-
Hi, @coises,
@coises, very sorry about the supposed bug with the Bookmarks feature! I finally understood that this bug happens ONLY IF you choose the ICU regex engine! If you’re using the Plain or Regex options, everything works as expected ;-)) Is there a limitation on using Bookmarks when the true ICU regex engine is active?
I even noticed that, like with the native Mark dialog, if you have both bookmarks and marked text in the current document and the Bookmark lines when marking text option, in Tools, is unchecked, a Remove marks and bookmarks from active document action does not clear the Bookmarks, as expected!
BR
guy038
-
@guy038 said in Search++: A work in progress:
- When focus is on Search++, the two shortcuts Ctrl + Shift + Y and Ctrl + Shift + E (each of which opens a new dialog) wrongly add the control character EM or ENQ to the text currently typed in the Find box.
Thank you for catching that! I see it here, too. I will figure out why (no doubt it’s related to the fact that those two combinations open a dialog) and fix it.
- But then, I suppose that the last line of the Replace dialog (Do not jump to next match or Jump to next match) and its sub-menu (to adopt the opposite behavior) are rather redundant and useless? What is your feeling about it?
I think it’s not useless because:
- If you’re going to choose an option from the drop-down menu and you want opposite behavior, it’s easier to open one menu than two.
- If you just click, rather than shift+click, the options on the sub-menu don’t change the Jump to next match after Replace setting; they just override it for that one command. So if you like to make a setting and keep it but only occasionally do it differently, you can do that and not have to remember to change the setting back again.
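The distinction between a persistent setting and a one-command override can be sketched as follows. This is an illustration only, not the plugin's code: `ReplaceSettings`, `run_replace` and their parameters are hypothetical names, and the assumption that shift+click stores the chosen behavior in the setting is inferred from the description above.

```python
from dataclasses import dataclass

@dataclass
class ReplaceSettings:
    # Hypothetical stand-in for the "Jump to next match after Replace" toggle.
    jump_after_replace: bool = True

def run_replace(settings, use_opposite=False, shift_click=False):
    """Return True if this Replace should jump to the next match.

    use_opposite: the command came from the "opposite jump behavior" sub-menu.
    shift_click:  assumed to make the chosen behavior the new setting.
    """
    # Flip the stored behavior only when the opposite was requested.
    jump = settings.jump_after_replace != use_opposite
    if shift_click:
        settings.jump_after_replace = jump  # persist the choice
    return jump
```

With the toggle on, `run_replace(s, use_opposite=True)` returns `False` and leaves the toggle on; passing `shift_click=True` as well returns `False` and also turns the toggle off.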
And, personally, I think that a new setting Force the 'Do not jump to next match' behavior during the X first replacements, with 0 < X < 9, would be interesting!
This sounds to me like it would be confusing to use. Your explanation is clear enough; I don’t mean that it is confusing. But using a counter that’s invisible to the user, the behavior of the same command just changes when it counts to a certain point? I can’t see it. Why would it be likely that people would want the same number of searches without automatic find every time, under all conditions?
If others also say they want this behavior, I won’t refuse to give it a try, but… I can’t say I’m fond of the idea.
I note and admit that convenient keyboard navigation of the button drop-down menu options is sorely needed, and as yet I have no good ideas for how to provide it.
-
@guy038 said in Search++: A work in progress:
@coises, very sorry about the supposed bug with the Bookmarks feature! I finally understood that this bug happens ONLY IF you choose the ICU regex engine! If you’re using the Plain or Regex options, everything works as expected ;-)) Is there a limitation on using Bookmarks when the true ICU regex engine is active?
Nothing to be sorry about; thank you for the observation, and for narrowing it down to ICU. I see what’s wrong: it’s a coding error. I’ll fix it.
I even noticed that, like with the native Mark dialog, if you have both bookmarks and marked text in the current document and the Bookmark lines when marking text option, in Tools, is unchecked, a Remove marks and bookmarks from active document action does not clear the Bookmarks, as expected!
If you look closely, when Bookmark lines when marking text is checked, the menu item should say Remove marks and bookmarks from active document; when Bookmark lines when marking text is not checked, that menu item should say Remove marks from active document.
(I’m not sure why I didn’t make the same change to Remove marks from multiple documents…, since it also removes bookmarks as well if Bookmark lines is checked.)
It could be that this is a confusing way to do it. I intended to make the Bookmark lines setting so that when it’s checked, bookmarks and marks “go together” for Mark and Show commands and for Remove marks; but when it’s not checked, Search++ only affects marks and does nothing with bookmarks.
Except that Add selection to marked text doesn’t also add bookmarks regardless of whether Bookmark lines is checked.
This does need refinement. There is certainly some inconsistency. I do want to preserve the ability to manipulate marks without touching bookmarks at all; but which commands should affect bookmarks, and whether I need separate commands for some kinds of bookmark manipulation, remains an open question.
-
Hello, @coises and All,
Regarding the two behaviors of the Replace command:
- First, I suppose that a special mark/sign/icon in the Search++ title zone, to clearly identify the current behavior of the Replace action, would be welcome!
- Then, the user will choose the desired behavior simply by using the Ctrl + J shortcut.
- And forget my idea of a new setting! In that specific case, the user would begin with the alternate Replace behavior then, after some tries, toggle to the usual Replace behavior!
Now, do we need the additional sub-menu in order to occasionally use the opposite Replace behavior, without changing the default settings, as you expressed?
To my mind, after using this opposite behavior (so, of course, changing the default behavior), the user would just have to hit the Ctrl + J shortcut again to restore the previous default!
In addition, I have probably forgotten some edge cases and, anyway, it’s your plugin, not my baby!
BR
guy038
-
@guy038 said in Search++: A work in progress:
Regarding the two behaviors of the Replace command :
- First, I suppose that a special mark/sign/icon in the Search++ title zone, to clearly identify the current behavior of the Replace action, would be welcome!
It is indicated by the icon on the Replace button. Is there some reason that isn’t enough?
Now, do we need the additional sub-menu in order to occasionally use the opposite Replace behavior, without changing the default settings, as you expressed?
I agree that the “opposite jump behavior” sub-menu isn’t strictly necessary, but I don’t think it does any harm, either. To my mind, the commands on that menu are still commands that “belong to” the Replace button. If anything, it’s the Jump to next match after Replace option on the Tools menu that strikes me as logically redundant; but I wanted something to which I could assign that Ctrl+J shortcut for easy switching.
If user experience indicates that the sub-menu creates more confusion than convenience, I’ll remove it.
-
Hello, @coises,
When I said :
- First, I suppose that a special mark/sign/icon in the Search++ title zone, to clearly identify the current behavior of the Replace action, would be welcome!
You answered me:
It is indicated by the icon on the Replace button. Is there some reason that isn’t enough?
🡪 replace then jump to a new match forward
🡨 replace then jump to a new match backward
🡪❚ replace and highlight replacement; next click finds a new match forward
❚🡨 replace and highlight replacement; next click finds a new match backward
You’re certainly younger than me and/or have very good eyes! I did notice that symbol at the right of the Replace button but it seems a bit tiny!
Best Regards,
guy038
P.S. :
Perhaps it would be a good idea to specify, in the manual, that :
- All sections concerning regular expressions and formulas work correctly when the Regex button is selected!
- When the ICU button is selected, you could also point out that the important features below are NOT supported:
  - The \K construction
  - All the Backtracking Control verbs, like (*SKIP) or (*F)
  - All the symbolic names, except for [[:ascii.]]
  - The invalid UTF-8 characters, like [[.x80.]] or [[.xff.]]
  - The \l and \u syntaxes as shorthand of [[:lowercase letter:]] and [[:uppercase letter:]] (which are the [[:lower:]] and [[:upper:]] equivalents when the Regex button is selected!)
-
@guy038 said in Search++: A work in progress:
When I said :
- First, I suppose that a special mark/sign/icon in the Search++ title zone, to clearly identify the current behavior of the Replace action, would be welcome!
You answered me:
It is indicated by the icon on the Replace button. Is there some reason that isn’t enough?
🡪 replace then jump to a new match forward
🡨 replace then jump to a new match backward
🡪❚ replace and highlight replacement; next click finds a new match forward
❚🡨 replace and highlight replacement; next click finds a new match backward
You’re certainly younger than me and/or have very good eyes! I did notice that symbol at the right of the Replace button but it seems a bit tiny!
Thank you for the observation. I consider those symbols important in general (not just for this specific case) because they remind you if you’ve click-selected one of the alternatives from the drop-down menus. Before Search++ can be considered ready for a first “stable” release, I have to make sure they are clearly legible. (I used symbols instead of words because the buttons would have to be much bigger to show the full command names as used in the drop-down menus, and that in turn would make the minimum useful size of the dialog much bigger.)
I’m 68 — I don’t know if that’s younger than you. It’s surely not that my eyes are that good. One of the main things I wanted to accomplish in Search++ was using Scintilla controls for the find and replace text, partly because I’m so tired of struggling to read what I’ve typed into those boxes in Notepad++ and Columns++ searches. (The other main reason was to avoid the complications that arise with line endings and “invisible” characters in the standard Windows controls.)
So I think it’s either that I can see the difference in the two symbols because I know what I’m looking for — after all, I don’t have any trouble with the buttons and check boxes in standard search dialogs, and their font is the same as the one in the find and replace boxes — or they aren’t displaying the same on all systems. (Or both.)
For development purposes, to keep things simple, I’ve used Unicode characters for the symbols on the buttons. That has somewhat limited my choice of symbols, as well as given me no control over the size and weight (aside from finding a different symbol). It could also be the case that they display differently on different systems. Which all means that using Unicode symbol characters is a bad way to do this. At some point I will need to replace them using a different method that will be more complex, but will give me more control.
Are you using a high-dpi monitor, by any chance? At present I do not have one available for testing. I have read various information from Microsoft about it, but information without actual practice tends to turn into gibberish… at this point I don’t think I can adequately predict how this will look on high dpi.
Perhaps it would be a good idea to specify, in the manual, that :
- All sections concerning regular expressions and formulas work correctly when the Regex button is selected!
- When the ICU button is selected, you could also point out that the important features below are NOT supported:
  - The \K construction
  - All the Backtracking Control verbs, like (*SKIP) or (*F)
  - All the symbolic names, except for [[:ascii.]]
  - The invalid UTF-8 characters, like [[.x80.]] or [[.xff.]]
Indeed, my documentation says very little about the ICU search at present. At some point before a first “stable” release I will either document the ICU search more thoroughly or “hide” it so users won’t stumble on it and be confused by it.
- The \l and \u syntaxes as shorthand of [[:lowercase letter:]] and [[:uppercase letter:]] (which are the [[:lower:]] and [[:upper:]] equivalents when the Regex button is selected!)
A minor note: this is a point in which Search++ Regex differs from Columns++ as a result of changes I made to use ICU as the source of information for Unicode properties instead of the mechanism I cobbled together in Columns++.
In Columns++, \l and [[:lower:]] are equivalent to [[:lowercase letter:]] (or [[:Ll:]], or \p{Ll}) — 2,283 matches in your Total_Chars.txt file.
In Search++ Regex, \l and [[:lower:]] are equivalent to (?-i)[[:lower:]] in ICU — 2,595 matches in Total_Chars.txt.
You can still use [[:lowercase letter:]] (or [[:Ll:]], or \p{Ll}) in Search++ Regex to match the same 2,283 characters as (?-i)[[:lowercase letter:]] in ICU.
Unlike ICU, both Columns++ and Search++ Regex ignore case insensitivity when matching named character classes (including the \l and \u shorthands).
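The gap between the two counts comes from the difference between the general category Ll and the broader Lowercase property, which also includes characters listed as Other_Lowercase (for example U+00AA, whose general category is Lo). A rough way to see the same effect outside Search++ is sketched below. It assumes that Python's single-character `str.islower()` follows the Lowercase property, which is my understanding of CPython rather than anything stated in this thread, and its counts depend on the interpreter's Unicode version, so they need not equal 2,283 and 2,595.

```python
import sys
import unicodedata

# Every code point except surrogates.
chars = [chr(cp) for cp in range(sys.maxunicode + 1)
         if not 0xD800 <= cp <= 0xDFFF]

# General category Ll, comparable to [[:lowercase letter:]] or \p{Ll}.
category_ll = {c for c in chars if unicodedata.category(c) == "Ll"}

# Single-character islower(), used here as an approximation of the
# Lowercase property (assumption, see the note above).
lowercase_prop = {c for c in chars if c.islower()}

# Characters that are Lowercase without being in category Ll.
extra = lowercase_prop - category_ll

print(len(category_ll), len(lowercase_prop), len(extra))
```

The `extra` set is the Python-side analogue of the characters that \l and [[:lower:]] match in Search++ Regex but not in Columns++.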