Generic Regex: Replacing in a specific zone of text

This regex S/R lets you restrict a replacement to a specific zone of text, possibly repeated, on one or several consecutive lines.
This is particularly useful when dealing with XML or HTML languages, if you need to do some modifications within a specific start and end tag range only.

- Let FR ( Find Regex ) be the regex which defines the char, string or expression to be searched
- Let RR ( Replacement Regex ) be the regex which defines the char, string or expression which must replace the FR expression
- Let BSR ( Begin Search-region Regex ) be the regex which defines the beginning of the area where the search for FR must start
- Let ESR ( End Search-region Regex ) be the regex which defines the end of the area where the search for FR must stop

Then, the generic regex can be expressed :

SEARCH (?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\K(?-si:FR)

REPLACE RR

The first match must begin at a BSR; each later match must begin exactly where the previous one ended ( \G ), and the (?!ESR). part refuses to step over an ESR, so the chain of matches stops at the end of the zone.

When the BSR and the different matches of the FR regex are all located in a single line, any line-ending char(s) will implicitly break the \G chain. The ESR part is then useless and the generic regex can be simplified into :

SEARCH (?-s)(?-i:BSR|(?!\A)\G).*?\K(?-i:FR)

REPLACE RR
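For readers who want to try the same idea outside Notepad++ : Python's standard re module has neither \G nor \K, so the generic regex cannot be pasted as is. The sketch below is only an emulation under that assumption. It first isolates each zone ( from a BSR match up to the next ESR, or up to the end of the text ), then applies FR => RR inside that zone only. The name replace_in_zones and its parameters are made up for this illustration, and RR must follow Python's replacement syntax ( \1, \g<0> ) rather than $0.

```python
import re


def replace_in_zones(text, bsr, esr, fr, rr):
    """Hypothetical helper: apply FR -> RR only between BSR and the next ESR."""
    # A zone is a BSR match followed by every char that does not start an ESR.
    # (?s:.) lets the zone body cross line endings without changing how the
    # dot behaves inside the BSR or ESR patterns themselves.
    zone = re.compile(r"(?P<begin>%s)(?P<body>(?:(?!%s)(?s:.))*)" % (bsr, esr))
    find = re.compile(fr)

    def fix_zone(match):
        # BSR is kept untouched; only the zone body goes through FR -> RR.
        return match.group("begin") + find.sub(rr, match.group("body"))

    return zone.sub(fix_zone, text)


sample = "a-b <t>c--d\ne-f</t> g-h"
result = replace_in_zones(sample, r"<t>", r"</t>", r"-+", " ")
print(result)
```

Unlike the Notepad++ regex, this two-pass version never lets an FR match run past the ESR, which is usually what is wanted but is still a small difference in behaviour.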
IMPORTANT :

- You must use, at least, the v7.9.1 N++ release, so that the \A assertion is correctly handled
- You must move the caret to the very beginning of the current file ( Ctrl + Home )
- If you perform a simple search, without any replacement, just click several times on the Find Next button to see the different zones affected by the future replacement
- As soon as a replacement is needed, you'll have to click on the Replace All button, exclusively. Thus, it will perform a global replacement on the entire file
NOTES :

- Each non-capturing group, relative to the BSR, ESR and FR regexes, may be prefixed with the s or -s modifiers :
    - If the BSR and/or ESR and/or FR regexes may match EOL characters, use the s modifier in the appropriate non-capturing group(s)
    - If the BSR and/or ESR and/or FR regexes do not match EOL characters, use the -s modifier in the appropriate non-capturing group(s)
- Each non-capturing group, relative to the BSR, ESR and FR regexes, may be prefixed with the i or -i modifiers :
    - If the BSR and/or ESR and/or FR regexes are sensitive to case, use the -i modifier in the appropriate non-capturing group(s)
    - If the BSR and/or ESR and/or FR regexes are insensitive to case, use the i modifier in the appropriate non-capturing group(s)
- Of course, these modifiers may not be necessary ( for instance, when searching for an exact string or for non-letter characters )
- Note that the generic regexes, above, show the case where :
    - These two generic regexes are sensitive to case => The -i modifier is present everywhere in the definitions
    - The ESR region of the first regex may span several lines => The s modifier is set in the ESR non-capturing group
- The FR regex may define a group, between parentheses, which will be re-used in the RR regex with the \# or ${#} syntaxes, where # represents an integer
- The RR regex may contain the $0 syntax, which refers to the whole match ( that is, the FR match, since \K drops everything matched before it ), or re-use a group previously defined in the FR regex
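To illustrate the last two notes about re-using groups in the replacement, here is a small Python sketch. It is an emulation, not the Notepad++ regex : the zone is isolated first, then FR => RR is applied inside it. The sample line, the </descrip> stop condition and the variable names are assumptions chosen for the illustration. In Python, \1 re-uses group 1 and \g<0> plays the role of $0.

```python
import re

line = "<descrip>keep (this) and (that)</descrip> <param>val (250)</param>"

# Zone: everything after <descrip> up to, but not including, </descrip>.
zone = re.compile(r"(?<=<descrip>)(?:(?!</descrip>).)*")
# FR with one capturing group around the text inside the parentheses.
fr = re.compile(r"\((.+?)\)")

# RR = [\1] : re-use group 1, swapping the parentheses for square brackets.
bracketed = zone.sub(lambda m: fr.sub(r"[\1]", m.group(0)), line)
# RR = \g<0>\g<0> : write the whole FR match twice.
doubled = zone.sub(lambda m: fr.sub(r"\g<0>\g<0>", m.group(0)), line)

print(bracketed)
print(doubled)
```

In both cases the (250) of the <param> tag is left alone, because it lies outside the zone.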
Below are two examples to illustrate how to build real regex S/R from these generic ones !

First, let's imagine that you want to delete any part within parentheses in any range of text <descrip>............</descrip>, only, located in a single line

- Paste the XML text, below, in a new tab :

    <iden>123456 (START)</iden>
    <name>Case_1</name>
    <descrip>This is a (short) text to (easily) see the results (of the modifications)</descrip>
    <param>val (250)</param>

    <iden>123456</iden>
    <name>Case_2</name>
    <descrip>And the (obvious) changes occur only in (the) "descrip" tag</descrip>
    <param>val (500)</param>

    <iden>123456 (END)</iden>
    <name>Case_3</name>
    <descrip>All (the) other tags are (just) untouched</descrip>
    <param>val (999)</param>

- As all the parts to delete are contained in a single line, we can use the simplified formulation :
    - SEARCH (?-s)(?-i:BSR|(?!\A)\G).*?\K(?-i:FR)
    - REPLACE RR
- Obviously, as we want to delete, the RR regex is a zero-length string. So, the Replace with field will be empty
- Now, the FR regex represents a space char followed by the shortest text between parentheses => FR = (?:\x20\(.+?\)) We do not need any case modifier as this regex does not refer to letters !
- The BSR regex is simply the literal string <descrip>, with this exact case. So BSR = <descrip>, placed inside the (?-i:.....) group

Finally, the functional regex S/R to use is :

- SEARCH (?-s)(?-i:<descrip>|(?!\A)\G).*?\K(?:\x20\(.+?\))
- REPLACE Leave EMPTY

Then :

- Open the Replace dialog ( Ctrl + H )
- Untick all options
- Select the Regular expression search mode
- Move to the very beginning of the current file ( Ctrl + Home )
- Hit the Find Next button several times to verify that the FR regex does match what you want ! In the present case, it matches a space followed by text between parentheses
- Again, move to the very beginning of the current file ( Ctrl + Home )
- Click, once only, on the Replace All button

=> As expected, all text between parentheses, in the <descrip> tags only, has been deleted, but the other parentheses, present in other tags, are untouched !
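The same single-line case can be checked outside Notepad++ with a short Python sketch. Since Python's re module lacks \G and \K, this is an emulation and not the regex above : the zone ( the rest of the line after <descrip> ) is isolated first, then the FR regex is applied inside it with an empty replacement. The variable names are assumptions, and the sample is limited to the first case of the text above.

```python
import re

SAMPLE = """\
<iden>123456 (START)</iden>
<name>Case_1</name>
<descrip>This is a (short) text to (easily) see the results (of the modifications)</descrip>
<param>val (250)</param>
"""

# Zone: the rest of the line after <descrip> (the dot stops at the line ending,
# which mirrors the (?-s) behaviour of the simplified regex).
ZONE = re.compile(r"(?<=<descrip>).*")
# FR: a space followed by the shortest text between parentheses.
FR = re.compile(r"\x20\(.+?\)")

# RR is empty, so every FR match inside a zone is simply deleted.
result = ZONE.sub(lambda m: FR.sub("", m.group(0)), SAMPLE)
print(result)
```

Only the <descrip> line changes; the (START) and (250) parts, located in other tags, stay as they are.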
In the second example, we'll try to replace any number of consecutive dash characters with a single space char in any range <text>..........</text>, possibly split into several lines

- Paste the following XML text in a new tab :

    <val>37--001</val>
    <text>This-is -a</text>
    <pos>4-1234</pos>

    <val>37--002</val>
    <text>-small---example</text>
    <pos>9-0012</pos>

    <val>37--003</val>
    <text>-of-text- which-</text>
    <pos>1-9999</pos>

    <val>37--004</val>
    <text>need -to-be- modi fied</text>
    <pos>0-0000</pos>

- As, this time, the <text>..........</text> range may be spread over several lines, we'll use the first generic regex :
    - SEARCH (?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\K(?-si:FR)
    - REPLACE RR
- Obviously, the RR regex is simply \x20
- Now, the FR regex represents a non-null number of consecutive dashes => FR is just -+, as no non-capturing group is needed here
- The BSR regex is simply the literal string <text>, with this exact case => BSR = <text>, placed inside the (?-si:.....) group
- The ESR regex is the literal string </text>, with this exact case. So the ESR regex, within its non-capturing group, is (?s-i:(?!</text>).)

Then, the real regex S/R to use is :

- SEARCH (?-si:<text>|(?!\A)\G)(?s-i:(?!</text>).)*?\K-+
- REPLACE \x20

Then :

- Open the Replace dialog ( Ctrl + H )
- Untick all options
- Select the Regular expression search mode
- Move to the very beginning of the current file ( Ctrl + Home )
- Hit the Find Next button several times to verify that the FR regex does match what you want ! In the present case, it matches any consecutive range of dash chars
- Again, move to the very beginning of the current file ( Ctrl + Home )
- Click, once only, on the Replace All button

=> As expected, all ranges of consecutive dashes, in the <text> tags only, have been replaced with a single space char and the other dash characters, present in other tags, are kept !
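For the multi-line case, a comparable Python sketch could look as follows. Again, it is an emulation, because Python's re module has no \G and no \K : each zone between <text> and </text> is isolated first, then the dashes are replaced inside it. The shortened sample, with a line break deliberately placed inside the <text> tag, and the variable names are assumptions made for the illustration.

```python
import re

SAMPLE = """\
<val>37--001</val>
<text>This-is
-a</text>
<pos>4-1234</pos>
"""

# Zone: every char after <text> that does not start </text>.
# (?s:.) lets the zone cross line endings, like the s modifier of the ESR group.
ZONE = re.compile(r"(?<=<text>)(?:(?!</text>)(?s:.))*")
# FR: one or more consecutive dashes.
DASHES = re.compile(r"-+")

# RR is a single space.
result = ZONE.sub(lambda m: DASHES.sub(" ", m.group(0)), SAMPLE)
print(result)
```

The dashes of the <val> and <pos> tags are outside any zone, so they are kept.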
Two other examples regarding this generic regex ! In these ones, we'll even restrict the replacements to the part of each concerned zone located before a # character !

Paste the text below in a new tab :

    <iden>123456 (START)</iden>
    <name>Case_1</name>
    <descrip>This is a (short) text to (easily) see the results (of the modifications)# (12345) test (67890)</descrip>
    <param>val (250)</param>

    <iden>123456</iden>
    <name>Case_2</name>
    <descrip>And the (obvious) changes occur only in (the) "descrip" tag # Parentheses (Yeaah) OK</descrip>
    <param>val (500)</param>

    <iden>123456 (END)</iden>
    <name>Case_3</name>
    <descrip>All (the) other tags are (just) untouched #(This is) the end (of the test)</descrip>
    <param>val (999)</param>

In this first example, of single-line <descrip> tags, two solutions are possible :

- Use the complete generic regex (?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\K(?-si:FR) where ESR = # which leads to the functional S/R :
    - SEARCH (?-s)(?-i:<descrip>|(?!\A)\G)((?!#).)*?\K(?:\x20\(.+?\))
    - REPLACE Leave EMPTY

=> This time, in addition to replacing only in each <descrip>..........</descrip> zone, NO replacement will occur after the # character of each <descrip> tag !

- Use the simplified solution and add an ESR condition, as a look-ahead, at the end of the regex, giving this generic variant (?-s)(?-i:BSR|(?!\A)\G).*?\K(?-i:FR)(?=.*ESR)
    - SEARCH (?-s)(?-i:<descrip>|(?!\A)\G).*?\K(?:\x20\(.+?\))(?=.*#)
    - REPLACE Leave EMPTY

However, this other solution requires that all the <descrip> tags contain a comment zone with a # char, as the look-ahead fails on any line without one
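The first solution ( stop at the # char ) can be emulated in Python as sketched below. As Python's re module offers neither \G nor \K, the zone is isolated first and the FR regex is then applied inside it. The variable names are assumptions, and only the third <descrip> line of the sample is used.

```python
import re

line = "<descrip>All (the) other tags are (just) untouched #(This is) the end (of the test)</descrip>"

# Zone: every char after <descrip> that is not a '#'.
# The dot does not cross line endings here, like (?-s) in the regex above.
zone = re.compile(r"(?<=<descrip>)(?:(?!#).)*")
# FR: a space followed by the shortest text between parentheses.
fr = re.compile(r"\x20\(.+?\)")

# RR is empty: delete each FR match found before the '#'.
result = zone.sub(lambda m: fr.sub("", m.group(0)), line)
print(result)
```

Everything from the # char onwards is outside the zone, so the parentheses of the comment part are preserved.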
Now, paste this other text below in a new tab :

    <val>37--001</val>
    <text>This-is -a--very---< # Dashes - - - OK/text>
    <pos>4-1234</pos>

    <val>37--002</val>
    <text>-small----#---example</text>
    <pos>9-0012</pos>

    <val>37--003</val>
    <text>-of-a-text- which-</text>
    <pos>1-9999</pos>

    <val>37--004</val>
    <text>need -to-be- modi fied # but - not - there</text>
    <pos>0-0000</pos>

This second example is a multi-line replacement, in each <text>.............</text> zone only, and also limited to the part before a # char, which may be present or not

Of course, we'll have to use the complete generic regex (?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\K(?-si:FR) but, instead of a single (?!ESR), we'll have to use this variant, with two stop conditions :

(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR_1)(?!ESR_2).)*?\K(?-si:FR)

So, the functional regex S/R becomes :

- SEARCH (?-si:<text>|(?!\A)\G)(?s-i:(?!</text>)(?!#).)*?\K-+
- REPLACE \x20

=> ONLY IF a sequence of dashes is located in a <text>..........</text> zone AND, moreover, before a possible # char, it will be replaced with a single space character

As you can verify, the third multi-line <text>.............</text> zone does not contain any # char. Thus, all dash characters of that <text> tag are replaced with a single space char !
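The variant with two stop conditions can also be sketched in Python. It is an emulation, since Python's re module has no \G and no \K : each zone runs from <text> up to the first </text> or # char, and the dashes are replaced inside it only. The variable names are assumptions, and the sample is limited to the second case of the text above.

```python
import re

SAMPLE = """\
<val>37--002</val>
<text>-small----#---example</text>
<pos>9-0012</pos>
"""

# Zone: every char after <text> that starts neither </text> (ESR_1) nor '#' (ESR_2).
ZONE = re.compile(r"(?<=<text>)(?:(?!</text>)(?!#)(?s:.))*")
# FR: one or more consecutive dashes.
DASHES = re.compile(r"-+")

# RR is a single space.
result = ZONE.sub(lambda m: DASHES.sub(" ", m.group(0)), SAMPLE)
print(result)
```

The dashes located after the # char, as well as those of the <val> and <pos> tags, are outside the zone and remain unchanged.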
Reminder :

- You must use, at least, the v7.9.1 N++ release, so that the \A assertion is correctly handled
- Move to the very beginning of the file before any Find Next sequence or Replace All operation
- Do not click on the step-by-step Replace button