Need help, please - regular expressions
- 
 I can't get this to work.
 Example source:
 <img lorem-ipsum-dolor=“/lorem/ipsum/dolor-2015-and/123456789012345/lorem_ipsum/1a2b3c4dd9651/a23w34m87.jpg” class=“lorem123” src=“11LOREM%202%20%20IpSuM%20Dolor%20sit%20%20amet%20consecteur%20-%20AdiPISCIng%20123456%20elit%20Curabitur%20QWERTY%20202020%20yes%20urna%20Interdeum%20%20Off%20Cras_files/a01b02c68.png” alt=“a01b02c68.bmp” height=“101” width=“102”>
 Regex search:
 (?:<img lorem-ipsum-dolor)(?:.?)( class=|)(".?“)( src=)(?:.?)(?:_files/)(.?.[jpg|png|tif|gif]”)
 Replace:
 <img\1\2\3"\4
 And I get this:
 <img"/lorem/ipsum/dolor-2015-and/123456789012345/lorem_ipsum/1a2b3c4dd9651/a23w34m87.jpg" class=“lorem123” src=“a01b02c68.png” alt=“a01b02c68.bmp” height=“101” width=“102”>
 This part should not be there:
 “/lorem/ipsum/dolor-2015-and/123456789012345/lorem_ipsum/1a2b3c4dd9651/a23w34m87.jpg”
 It should look like:
 <img class=“lorem123” src=“a01b02c68.png” alt=“a01b02c68.bmp” height=“101” width=“102”>
 What am I doing wrong?
- 
 The forum stripped the asterisks from the regex above, sorry. Regex search:
 (?:<img lorem-ipsum-dolor)(?:.*?)( class=|)(“.*?”)( src=)(?:.*?)(?:_files/)(.*?.[jpg|png|tif|gif]")
- 
 Very close. You had one minor issue. This: ( class=|) should be ( class=)
- 
 Also, take a look at your original expression here. You can see the regular expression is allowing it to skip class=.
 Note: that website might not use the exact same regular expression engine, but it should be close enough to use as a reference.
- 
 Keeping it short has backfired again: I wanted to be brief and it came out badly.
 The expression also has to match:
 <img lorem-ipsum-dolor=“/lorem/ipsum/dolor-1999-and/123456789012345/lorem_ipsum/1a2b3c4dd9651/c011XX001.tif” src=“11LOREM%202%20%20IpSuM%20Dolor%20sit%20%20amet%20consecteur%20-%20AdiPISCIng%20123456%20elit%20Curabitur%20QWERTY%20202020%20yes%20urna%20Interdeum%20%20Off%20Cras_files/c01vv0x01.jpg” alt=“c01vv0x01.jpeg” height=“567” width=“789”>
 IMHO ( class=|) means "with it, or without it". I emphasize: IMHO.
 Without the empty alternative, this second example is not matched. Because | means OR, right?
 At least that is how it looks to me, sorry…
- 
 You are right. ( class=|) can skip it. If you look at the image I linked, when it skips group 1 it must still “consume” something for group 2. So you want a regular expression that only captures group 2 if group 1 exists.
 So this section of your expression: ( class=|)(".*?") should be replaced with something like: (?:( class=)(".*?"))?
 Note this isn’t perfect, but it should point you in the right direction.
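 To make the "group 2 must still consume something" point concrete, here is a minimal Python sketch. It assumes Python's re module behaves like the Notepad++ regex engine for these constructs (close, but not guaranteed identical), it uses straight quotes, and the two sample tags are shortened, hypothetical versions of the examples in this thread.

```python
import re

# Shortened, hypothetical samples with the same attribute layout as the thread's examples.
WITH_CLASS = ('<img lorem-ipsum-dolor="/lorem/a23w34m87.jpg" class="lorem123" '
              'src="11LOREM%20Cras_files/a01b02c68.png" alt="a01b02c68.bmp">')
WITHOUT_CLASS = ('<img lorem-ipsum-dolor="/lorem/c011XX001.tif" '
                 'src="11LOREM%20Cras_files/c01vv0x01.jpg" alt="c01vv0x01.jpeg">')

HEAD = r'(?:<img lorem-ipsum-dolor)(?:.*?)'
TAIL = r'( src=)(?:.*?)(?:_files/)(.*?.[jpg|png|tif|gif]")'

# Original middle part: group 1 may match the empty string, so group 2 is
# free to start at the very first quote (the lorem-ipsum-dolor value).
ORIGINAL = re.compile(HEAD + r'( class=|)(".*?")' + TAIL)

# Suggested middle part: group 2 can only be captured together with group 1.
SUGGESTED = re.compile(HEAD + r'(?:( class=)(".*?"))?' + TAIL)

# Same replacement as in the thread. Since Python 3.5, a group that did not
# take part in the match is replaced by an empty string.
REPLACEMENT = r'<img\1\2\3"\4'

match = ORIGINAL.match(WITH_CLASS)
print(repr(match.group(1)))  # '' : the empty alternative was taken
print(match.group(2))        # "/lorem/a23w34m87.jpg" class="lorem123"

print(ORIGINAL.sub(REPLACEMENT, WITH_CLASS))
print(SUGGESTED.sub(REPLACEMENT, WITH_CLASS))
print(SUGGESTED.sub(REPLACEMENT, WITHOUT_CLASS))
```

 With the original middle part, the unwanted path stays at the front of the tag; with the optional group, both sample tags come out without it.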
- 
 In the picture it looks great. In practice, that part is still left at the front :( 
- 
 (?:<img lorem-ipsum-dolor)(?:.*?)(?>( class=|)(“.*?”))?(“.*?”)( src=)(?:.*?)(?:_files/)(.*?[jpg|png|tif|gif]")
 With an atomic group, it works.
 Thanks for pointing me in the right direction!
- 
 I’m sorry, I was blind: it doesn’t work.
- 
 Finally:
 <code>(?:<img lorem-ipsum-dolor)(?:.*?)(?:( class=)(“.*?”)( src=)|( src=))(?:.*?)(?:_files/)(.*?[jpg|png|tif|gif]")</code>
 P.S. I don’t know why it doesn’t show up in red for me :) 
- 
 Hello Jan and Dail,
 Jan, I have not tried your regex S/R yet; first, I wanted to fully understand it. I just noticed two points:
- After copying your example source into my Notepad++, the delimiters of the different attributes ( class, src, alt, height and width ) are the pair “.....”, that is to say the LEFT DOUBLE QUOTATION MARK ( Unicode code-point \x{201c} ) and the RIGHT DOUBLE QUOTATION MARK ( Unicode code-point \x{201d} ). These characters are different from the usual QUOTATION MARK " ( \x22 ). Therefore, the regex proposed below is based on these two characters, \x{201c} and \x{201d}.
- Seemingly, your picture files can have the .jpg, .png, .tiff or .gif extension. But the regex you use to match these extensions ( [jpg|png|tif|gif] ) is totally WRONG, because the | symbol is taken literally between square brackets ! Indeed, this syntax is a character class, which matches a single character: either the pipe symbol ( | ) or one of the letters j, p, g, n, t, i, f, whatever their case when the search is case-insensitive. In other words, this part of your regex could simply be rewritten [fgijnpt|]
 So, the correct regex is simply (jpg|png|tiff|gif): one extension, among the four possible ones !
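 The difference between the character class and the alternation can be checked with a small Python sketch. It assumes Python's re module treats these two constructs the same way as Notepad++ does; the short file names are made up for illustration.

```python
import re

# Dot, then exactly ONE character out of j p g | n t i f, then a quote.
CHAR_CLASS = re.compile(r'\.[jpg|png|tif|gif]"')
# Dot, then one whole extension, then a quote.
ALTERNATION = re.compile(r'\.(?:jpg|png|tiff|gif)"')
# The tail of the regex used in the thread: the lazy .*? absorbs most of the
# name, so the class only ever checks the last letter before the quote.
LOOSE_TAIL = re.compile(r'.*?.[jpg|png|tif|gif]"')

for text in ['a.png"', 'a.g"', 'a.|"', 'a.bmp"']:
    print(text,
          bool(CHAR_CLASS.search(text)),
          bool(ALTERNATION.search(text)))

# Accepted although bmp is not in the list, because it ends with the letter p.
print(bool(LOOSE_TAIL.fullmatch('a01b02c68.bmp"')))
```

 This is also why the square-bracket version can appear to work on real data: every listed extension happens to end with a letter that is in the class.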
 Then, I propose the following regex S/R:
 SEARCH (?i-s).*?(class=“.*?” src=“).*?_files/(.*?(jpg|png|tiff|gif)) ( with a space before src )
 REPLACE <img \L\1\2
 Notes:
- The two modifiers (?i-s) make the search case-insensitive and make the dot match standard characters only ( not line breaks ). In the replacement, the two groups \1 and \2 are rewritten in lower case, due to the \L syntax
- Each of the four .*? forms matches the shortest possible run of characters before the string that follows it
- All matched text that is NOT located between round brackets, including everything on the line before the first string class, is therefore deleted by the replacement
- The group \1 is the string class=“…” src=“ and the group \2 is the name of the picture, with its extension. Both are rewritten in lower case, after an initial <img string.
 If you really need the match to begin with the string <img lorem-ipsum-dolor, just change the search regex into:
 SEARCH (?i-s)<img lorem-ipsum-dolor.*?(class=“.*?” src=“).*?_files/(.*?(jpg|png|tiff|gif))
 Best Regards, guy038 
- 
 Thank you very much for the analysis. I know the wrong brackets should not work. But in this specific example, they do. 
 Example:
 <img lorem-ipsum-dolor="/lorem/ipsum/dolor-2015-and/123456789012345/lorem_ipsum/1a2b3c4dd9651/a23w34m87.jpg" class="lorem123" src="11LOREM%202%20%20IpSuM%20Dolor%20sit%20%20amet%20consecteur%20-%20AdiPISCIng%20123456%20elit%20Curabitur%20QWERTY%20202020%20yes%20urna%20Interdeum%20%20Off%20Cras_files/a01b02c68.png" alt="a01b02c68.bmp" height="101" width="102">
 <img lorem-ipsum-dolor="/lorem/ipsum/dolor-1999-and/123456789012345/lorem_ipsum/1a2b3c4dd9651/c011XX001.tif" src="11LOREM%202%20%20IpSuM%20Dolor%20sit%20%20amet%20consecteur%20-%20AdiPISCIng%20123456%20elit%20Curabitur%20QWERTY%20202020%20yes%20urna%20Interdeum%20%20Off%20Cras_files/c01vv0x01.jpg" alt="c01vv0x01.jpeg" height="567" width="789">
 Regex:
 (?:<img lorem-ipsum-dolor)(?:.*?)(?:( class=)(".*?")( src=)|( src=))(?:.*?)(?:_files\/)(.*?[jpg|png|tif|gif]")
 Replace:
 <img\1\2\3\4"\5
 After changing to the correct brackets, it also works:
 (?:<img lorem-ipsum-dolor)(?:.*?)(?:( class=)(".*?")( src=)|( src=))(?:.*?)(?:_files\/)(.*?(jpg|png|tif|gif)")
 but then there are 6 groups; the sixth simply does not need to be referenced in the replacement.
 With all due respect, your regex, correct as it may be, does not work here. Very sorry for my English, I am still learning. 
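 For anyone who wants to check this S/R outside Notepad++, here is a minimal Python sketch of the six-group version. It assumes Python's re module is close enough to the Notepad++ engine for this pattern; the clean() helper and the shortened sample tags are hypothetical, made up to mirror the two examples above.

```python
import re

# The search regex with round brackets around the extensions (6 groups).
PATTERN = re.compile(
    r'(?:<img lorem-ipsum-dolor)(?:.*?)(?:( class=)(".*?")( src=)|( src=))'
    r'(?:.*?)(?:_files\/)(.*?(jpg|png|tif|gif)")'
)
# Group 6 (the bare extension) is simply never referenced.
REPLACEMENT = r'<img\1\2\3\4"\5'

def clean(line):
    """Hypothetical helper: apply the S/R to every <img ...> tag in a line."""
    # Since Python 3.5, groups that did not take part in the match
    # (1-3 or 4, depending on the branch) are replaced by an empty string.
    return PATTERN.sub(REPLACEMENT, line)

# Shortened, hypothetical samples: one tag with class, one without.
WITH_CLASS = ('<img lorem-ipsum-dolor="/lorem/a23w34m87.jpg" class="lorem123" '
              'src="11LOREM%20Cras_files/a01b02c68.png" alt="a01b02c68.bmp">')
WITHOUT_CLASS = ('<img lorem-ipsum-dolor="/lorem/c011XX001.tif" '
                 'src="11LOREM%20Cras_files/c01vv0x01.jpg" alt="c01vv0x01.jpeg">')

# Both tags on one line, as in the example above.
print(clean(WITH_CLASS + WITHOUT_CLASS))
```

 Only one branch of the alternation takes part in each match, so either groups 1 to 3 or group 4 are filled, never both.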
- 
 Hi Jan,
 OK. I have now understood two main points about your problem:
- Firstly, the values of the different attributes are surrounded by the usual quotation mark ( " ), of Unicode code-point \x{0022}. Of course, my previous regex, based on the two delimiters \x{201c} and \x{201d}, COULDN’T work at all !
- Secondly, the attribute class="........" may, sometimes, be absent in a line. Again, my previous regex supposed that this attribute was always present :-((
 So, aware of the two facts above, my new proposed regex is:
 SEARCH (?i-s)<img lorem-ipsum-dolor.*?((?:class=".*?" )?src=").*?_files/(.*?(jpg|png|tiff|gif))
 REPLACE <img \L\1\2
 After running your S/R and mine, they both give the same results :-)) Nice ! 
 NOTES: Compared to my previous try:
- I replaced the special delimiters “.....” with the usual ones ".....", in the search regex
- I added a new non-capturing group (?:class=".*?" )?, which may be present or NOT, due to the final question mark ?
- There is a space ending the non-capturing group, before the closing round bracket
- The replacement regex has NOT changed 
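 For readers without Notepad++ at hand, the S/R above can be approximated in Python. This is only a sketch under assumptions: Python's re module does not accept the (?i-s) modifier form or the \L replacement escape, so the flag re.IGNORECASE and a replacement function stand in for them ( the dot already excludes line breaks by default ). The names clean and lower_groups, and the shortened sample tags, are hypothetical.

```python
import re

# Same search regex, with (?i-s) expressed through Python flags instead.
SEARCH = re.compile(
    r'<img lorem-ipsum-dolor.*?((?:class=".*?" )?src=").*?_files/'
    r'(.*?(jpg|png|tiff|gif))',
    re.IGNORECASE,
)

def lower_groups(match):
    """Stand-in for the replacement <img \\L\\1\\2 (Python has no \\L)."""
    return '<img ' + (match.group(1) + match.group(2)).lower()

def clean(line):
    """Hypothetical helper: apply the S/R to every matching tag in a line."""
    return SEARCH.sub(lower_groups, line)

# Shortened, hypothetical samples modelled on the thread's examples.
WITH_CLASS = ('<img lorem-ipsum-dolor="/lorem/a23w34m87.jpg" class="lorem123" '
              'src="11LOREM%20Cras_files/a01b02c68.png" alt="a01b02c68.bmp">')
WITHOUT_CLASS = ('<img lorem-ipsum-dolor="/lorem/c011XX001.tif" '
                 'src="11LOREM%20Cras_files/c01vv0x01.jpg" alt="c01vv0x01.jpeg">')
# Made-up upper-case variant, only to show the effect of the lower-casing.
UPPER = ('<img lorem-ipsum-dolor="/x.jpg" CLASS="Lorem123" '
         'SRC="AB_files/A01B02C68.PNG" alt="x">')

for tag in (WITH_CLASS, WITHOUT_CLASS, UPPER):
    print(clean(tag))
```

 Note that the match stops right after the extension, so the closing quote and the remaining attributes are left untouched by the replacement.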
 Cheers, guy038 
- 
 Thank you for your commitment 
 Best regards,
 Jan
