Regex: Add html tags in the lines that doesn't have html tags
-
I have this paragraph. It contains the line "I need someone to take me home.", which doesn't have html tags. So, I need to find this line (not the others) and wrap it in tags.

INPUT:

    <!-- START -->
    <p class="mb-40px">I may go to cinema</p>
    I need someone to take me home.
    <p class="mb-40px">I can love you now</p>
    <!-- FINAL -->

OUTPUT:

    <!-- START -->
    <p class="mb-40px">I may go to cinema</p>
    <p class="mb-40px">I need someone to take me home.</p>
    <p class="mb-40px">I can love you now</p>
    <!-- FINAL -->

I don't know why my regex doesn't work.

FIND:
    ^(?!<p class="mb-40px">)(.*?)((?!</p>).)*$

REPLACE:
    <p class="mb-40px">\2\</p>
-
OK, so I believe I took a step forward. It seems to work.

FIND:
    ^(?!<p class="mb-40px">)(([a-zA-Z-].+))((?!</p>).)*$

REPLACE BY:
    <p class="mb-40px">\2</p>

Now, I have to restrict this regex to the section between <!-- START --> and <!-- FINAL -->. I will use this generic formula:

    (?s)(?-i:REGION-START.+?">|\G(?!^))((?!REGION-FINAL).)*?\KFIND REGEX

which will become:

FIND:
    (?s)(?-i:<\!-- START -->.+?">|\G(?!^))((?!<\!-- FINAL -->).)*?\K^(?!<p class="mb-40px">)(([a-zA-Z-].+))((?!</p>).)*$

REPLACE:
    <p class="mb-40px">\2</p>

In this case, it is not very good: something does not work well in this final regex. Maybe @guy038 has a better opinion.
-
Hello @robin-cruise and All,

No need to use the generic formula !

Here is my general method :

- From the beginning of the current line, I try to find a line which does not contain :
    - A string <!-- START --> at any position of the current line
    AND
    - A string <!-- FINAL --> at any position of the current line
    AND
    (
    - A tag <p class="mb-40px"> at any position of the current line
    OR
    - A tag </p> at any position of the current line
    )
  So, a line which already contains both tags is skipped.

- Then I select all characters of the current line which come :
    - After a possible <p class="mb-40px"> tag
    - Before a possible </p> tag

So, given this INPUT text, below, with 3 lines to change :

    <!-- START -->
    <p class="mb-40px">I may go to cinema</p>
    I need someone to take me home.
    <p class="mb-40px">I may go to cinema</p>
    I need someone to take me home.</p>
    <p class="mb-40px">I may go to cinema</p>
    <p class="mb-40px">I need someone to take me home.
    <p class="mb-40px">I can love you now</p>
    <!-- FINAL -->

I use the following regex S/R :

SEARCH
    (?-is)^(?!.*<!-- START -->)(?!.*<!-- FINAL -->)(?:(?!.*<p class)|(?!.*</p>))(?:<p class="mb-40px">)?(?|(.+)</p>|(.+))

REPLACE
    <p class="mb-40px">\1</p>

And, after a click on the Replace All button, I get the expected OUTPUT text :

    <!-- START -->
    <p class="mb-40px">I may go to cinema</p>
    <p class="mb-40px">I need someone to take me home.</p>
    <p class="mb-40px">I may go to cinema</p>
    <p class="mb-40px">I need someone to take me home.</p>
    <p class="mb-40px">I may go to cinema</p>
    <p class="mb-40px">I need someone to take me home.</p>
    <p class="mb-40px">I can love you now</p>
    <!-- FINAL -->

Notes :

- First, after the usual modifiers, come the boundaries which must not be matched :

    (?!.*<!-- START -->)(?!.*<!-- FINAL -->)

- Then, each tag which must not be matched, as the two branches of an alternation within a non-capturing group :

    (?:(?!.*<p class)|(?!.*</p>))

- Now, after a possible (?:<p class="mb-40px">)? tag, in a non-capturing group too, the regex selects either :
    - All chars before the </p> tag
    OR
    - All remaining chars of the current line

Remark :

- Note the special syntax of this non-capturing group (?|(.+)</p>|(.+)), called a branch reset group. It numbers the groups of every alternative from the same level, so both (.+) are group 1. Thus, you just need the <p class="mb-40px">\1</p> syntax in the replacement part

- If I had used a normal non-capturing group (?:(.+)</p>|(.+)), two groups, 1 and 2, would have been defined ! So the correct replacement regex would have been <p class="mb-40px">\1\2</p>, as these two groups are mutually exclusive !

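The remark about mutually exclusive groups can be observed with a short Python sketch. Python's `re` has no branch reset group, so it can only show the "normal non-capturing group" variant; the variable names are hypothetical, and the `\1\2` behaviour relies on Python 3.5 or later, where an unmatched group expands to an empty string in a replacement.

```python
import re

# Ordinary non-capturing group: each alternative gets its own group number.
pattern = re.compile(r'^(?:<p class="mb-40px">)?(?:(.+)</p>|(.+))$')

with_close = pattern.match('I need someone to take me home.</p>')
without_close = pattern.match('I need someone to take me home.')

# Only one of the two groups is defined for any given line.
print(with_close.groups())     # group 1 set, group 2 is None
print(without_close.groups())  # group 1 is None, group 2 set

# Because the groups are mutually exclusive, \1\2 always expands to the
# text of whichever group took part in the match.
fixed = pattern.sub(r'<p class="mb-40px">\1\2</p>',
                    'I need someone to take me home.')
print(fixed)
```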
Best Regards,
guy038
-
-
@guy038 super, thanks.

What would the generic regex be in this case? (I cannot figure out the last part.)

    (?-is)^(?!.*REGION-START)(?!.*REGION-FINAL)(?:(?!.*<p class)|(?!.*</p>))(?:<p class="mb-40px">)?(?|(.+)</p>|(.+))
-
Hi, @robin-cruise,

You cannot use the generic regex, discussed in another topic, in order to solve your present goal. Why ?

Well, because that generic regex supposes :

- First, to match a BSR region, followed with any range of chars, possibly null, different from the ESR region, and, after a \K feature, match the FR region

- Then, to match, from the current caret position, any range of chars, possibly null, different from the ESR region, and, after a \K feature, match the FR region

But, in your present case, the INPUT lines to modify, like "I need someone to take me home.", do not contain the BSR and/or the ESR region. So, how do you expect to get these absent regions in the search regex ??

Best Regards,

guy038
-
-
SEARCH:
    (?-is)^(?!.*<!-- START -->)(?!.*<!-- FINAL -->)(?:(?!.*<p class)|(?!.*</p>))(?:<p class="mb-40px">)?(?|(.+)</p>|(.+))

REPLACE:
    <p class="mb-40px">\1</p>

Your regex seems to be very good, except for one thing: if I also have this code in my html pages, it will be changed as well. So, I need to change only the section between <!-- START --> and <!-- FINAL -->.

    <html lang="en">
    <head>
    <!-- Meta Tags -->
    <meta charset="utf-8"/>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org/",
      "@type": "Product",
      "name": "10 media farces of big days",
      "image": "icon.jpg",
      "description": "horses of Letea Delta Danube successfully saved,",
      "brand": {
        "@type": "Brand",
        "name": "something"
      },
      "sku": "NFL",
      "gtin8": "NFL",
      "offers": {
        "@type": "Offer",
        "url": "https://something.html",
        "priceCurrency": "RON",
        "price": "0",
        "priceValidUntil": "2022-02-15",
        "availability": "https://schema.org/OnlineOnly"
      },
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "5",
        "bestRating": "5",
        "ratingCount": "6"
      },
      "review": {
        "@type": "Review",
        "reviewRating": {
          "@type": "Rating",
          "ratingValue": "5",
          "bestRating": "5"
        },
        "author": {"@type": "Person", "name": "omehing"},
        "publisher": {"@type": "Organization", "name": "omehing"}
      }
    }
    </script>
-
Hi, @robin-cruise,
Once and for all, Robin, please post a complete, exact file which represents all the data that you need to change !
We cannot keep working this way if you do not provide real examples, because a regex depends closely on the real text !
BR
guy038
-
yes, but also I cannot copy/paste the entire html page. It is a very large html code.
-
Hi, @Robin-cruise
If you don’t mind, just send me your file by e-mail !
Here is my temporary mail address :
BR
guy038
-
Hello @robin-cruise and All,

Ah… OK. Thanks for the HTML file attached to your mail. It's always easier with a real example ;-))

Now, as you just have one <!-- ARTICOL START -->.......<!-- ARTICOL FINAL --> zone in your HTML file, the simple thing to do is :

- In search, to look for :
    - Any char from the very start of file till the complete <!-- ARTICOL START --> line
    OR
    - Any char from the <!-- ARTICOL FINAL --> line till the very end of your file
    OR ( Scan of lines between the <!-- ARTICOL START --> and <!-- ARTICOL FINAL --> boundaries )
    - A possible <p class="mb-40px"> tag, beginning the current line
    - Followed with a single-line range of characters :
        - Till a </p> tag, ending the current line
        OR
        - Till the end of current line

- In replacement, to rewrite :
    - ( If scan within the <!-- ARTICOL START -->.........<!-- ARTICOL FINAL --> zone, so when the group 2 is defined )
        - First, the <p class="mb-40px"> tag, if absent in the INPUT file ( group 1 not defined )
        - Then all the contents of current line ( $0 )
        - And, finally, the </p> tag, if absent in the INPUT file ( group 3 not defined )
    OR
    - The two ranges of chars, before the <!-- ARTICOL START --> boundary, included, and after the <!-- ARTICOL FINAL --> boundary ( which occur when the group 2 is not defined )

For instance, from this INPUT file, below :

    <!DOCTYPE html>
    ....
    bla bla
    ....
    blah bla
    <!-- ARTICOL START -->
    <p class="mb-40px">I need someone to take me home.</p>
    I need someone to take me home.
    I need someone to take me home.</p>
    <p class="mb-40px">I need someone to take me home.
    <!-- ARTICOL FINAL -->
    bla bla
    ....
    blah bla
    ....
    </html>

The following regex S/R :

SEARCH
    (?s-i)^.+<!-- ARTICOL START -->\R|<!-- ARTICOL FINAL -->.+|(?-s)^(<p class="mb-40px">)?(?|(.+)(</p>)|(.+))$

REPLACE
    ?2(?1:<p class="mb-40px">)$0(?3:</p>):$0

Should give you the expected results :

    <!DOCTYPE html>
    ....
    bla bla
    ....
    blah bla
    <!-- ARTICOL START -->
    <p class="mb-40px">I need someone to take me home.</p>
    <p class="mb-40px">I need someone to take me home.</p>
    <p class="mb-40px">I need someone to take me home.</p>
    <p class="mb-40px">I need someone to take me home.</p>
    <!-- ARTICOL FINAL -->
    bla bla
    ....
    blah bla
    ....
    </html>

The message Replace All: 6 occurrences were replaced is displayed in the status bar :

- One for the part between <!DOCTYPE html> and <!-- ARTICOL START -->
- One for each non-empty line between <!-- ARTICOL START --> and <!-- ARTICOL FINAL --> ( 4 lines )
- One for the part between <!-- ARTICOL FINAL --> and the very end of file

Note that this final solution does not need any look-ahead structure, nor the \G syntax or other goodies !!

Best Regards,
guy038
-
-
@guy038 said in Regex: Add html tags in the lines that doesn't have html tags:

SEARCH
    (?s-i)^.+<!-- ARTICOL START -->\R|<!-- ARTICOL FINAL -->.+|(?-s)^(<p class="mb-40px">)?(?|(.+)(</p>)|(.+))$

REPLACE
    ?2(?1:<p class="mb-40px">)$0(?3:</p>):$0

Great answer, thank you @guy038