@PeterJones thanks a lot for the nuances. Indeed, I first wondered how this differs from the group indexing, which starts at 1, and then how it differs from the quantifier ({n}, where n is an integer >= 1, see https://www.regular-expressions.info/refquick.html).
Thanks for the $0 group placeholder mention, I wondered about that too, now I understand what it captures.
I understand the regex as this:
Find:
Put everything that precedes the occurrence of interest into a group (the 1st group, referenced by the placeholder whose index starts at 1 ($1), though there is also a placeholder 0 ($0) which references the whole set/string instead of any subgroup of it). Exclude the occurrence of interest from that group, but state it as the search delimiter for the regex just outside the group.
Replace with:
Capture the group with its placeholder (make a copy of it and store it: $1 = foo / ^((?:.*?foo){0}.*?) for the 1st occurrence (N+1) with index 0). Use the 2nd/next occurrence as an external delimiter reference at which the regex search stops (^((?:.*?foo){0}.*?)foo). Then append the new value (XOO) to the copied, unchanged group.
I think I see what you mean: considering there must always be a 2nd/next occurrence for the regex to work, it can’t start at zero? Meanwhile, in the background, the engine uses zero-based indexing for the 1st element of the series of occurrences.
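To check my reading of the find/replace pair, here is a minimal Python sketch. It assumes the pattern uses the lazy form .*? and that Python's re module behaves like the Notepad++ engine for this simple case, which I have not verified in general. The function name replace_nth and the sample text are made up for illustration.

```python
import re

def replace_nth(text: str, n: int, target: str = "foo", new: str = "XOO") -> str:
    """Replace the occurrence of `target` that has exactly n occurrences before it.

    n = 0 replaces the 1st occurrence, n = 1 the 2nd, and so on.
    """
    escaped = re.escape(target)
    # Group 1 lazily captures everything before the occurrence of interest;
    # the trailing target sits outside the group and acts as the delimiter.
    pattern = r"^((?:.*?%s){%d}.*?)%s" % (escaped, n, escaped)
    # Equivalent of the "$1XOO" replacement: keep group 1, append the new value.
    return re.sub(pattern, lambda m: m.group(1) + new, text, count=1)

sample = "foo bar foo baz foo"
first = replace_nth(sample, 0)    # "XOO bar foo baz foo"
second = replace_nth(sample, 1)   # "foo bar XOO baz foo"
third = replace_nth(sample, 2)    # "foo bar foo baz XOO"
missing = replace_nth(sample, 3)  # no 4th occurrence, text is left unchanged
```

So {0} means "skip zero occurrences before the one to replace", which is why the count looks zero-based even though the quantifier itself is just a repeat count.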
0 is the 1st element in the index series, 1 is the 2nd, and so on.
For the group placeholders, however, 0 isn’t an ordinal reference; it’s an arbitrary reference to the set. The ordinal references start at 1 in this case.
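The difference between placeholder 0 and placeholder 1 can be seen in a small Python sketch, where m.group(0) plays the role of $0 and m.group(1) the role of $1. This is an analogy under the assumption that both engines number groups the same way; the sample text is made up, and the pattern again assumes the lazy .*? form.

```python
import re

# Match up to and including the 2nd "foo" (one occurrence is skipped first).
m = re.search(r"^((?:.*?foo){1}.*?)foo", "foo bar foo baz")

whole_match = m.group(0)  # like $0: everything the regex matched
first_group = m.group(1)  # like $1: only the text inside the 1st parentheses
group_count = len(m.groups())  # number of numbered capture groups (excludes group 0)
```

Here whole_match is "foo bar foo" while first_group is "foo bar ", so group 0 always exists and is not counted among the numbered groups.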
I need to check the doc and do more practice to get over the confusing parts!
Does the quantifier also start at 1, while 0 is still valid but returns no value (or the whole set, but with empty values)?
For example:
19 empty string matches:
Regex: [A-Z]{0}
Test string: goo A greAS gir PE
https://regex101.com/r/dYnJmE/1
/ [A-Z]{0} / gm
Match a single character present in the list below [A-Z]
{0} matches the previous token exactly zero times (causes token to be ignored)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
Match positions (each an empty string): 0-0, 1-1, 2-2, 3-3, 4-4, 5-5, 6-6, 7-7, 8-8, 9-9, 10-10, 11-11, 12-12, 13-13, 14-14, 15-15, 16-16, 17-17, 18-18
No match/invalid:
Regex: [A-Z]{}
Test string: goo A greAS gir PE
https://regex101.com/r/CtqQ0D/1
/ [A-Z]{} / gm
Match a single character present in the list below [A-Z]
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
{} matches the characters {} literally (case sensitive)
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
Your regular expression does not match the subject string.
5 matches:
Regex: [A-Z]{1}
Test string: goo A greAS gir PE
https://regex101.com/r/MImsNL/1
/ [A-Z]{1} / gm
Match a single character present in the list below [A-Z]
{1} matches the previous token exactly one time (meaningless quantifier)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
Match positions:
4-5 A
9-10 A
10-11 S
16-17 P
17-18 E
2 matches:
Regex: [A-Z]{2}
Test string: goo A greAS gir PE
/ [A-Z]{2} / gm
Match a single character present in the list below [A-Z]
{2} matches the previous token exactly 2 times
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
Match positions:
9-11 AS
16-18 PE
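The four regex101 runs above can also be reproduced outside the browser. Below is a minimal Python sketch, assuming Python's re module treats these quantifiers the same way as the regex101 flavor used above (it is a different engine, so this is a cross-check rather than proof). The helper name spans is made up for illustration.

```python
import re

SUBJECT = "goo A greAS gir PE"

def spans(pattern: str, subject: str = SUBJECT):
    """Return (start, end, matched_text) for every match, like the regex101 match list."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, subject)]

zero = spans(r"[A-Z]{0}")         # token repeated zero times: an empty match at every position
empty_braces = spans(r"[A-Z]{}")  # "{}" is not a quantifier here, it is read as literal braces
one = spans(r"[A-Z]{1}")          # same as plain [A-Z]
two = spans(r"[A-Z]{2}")          # two consecutive capitals

print(len(zero))     # 19
print(empty_braces)  # []
print(one)           # [(4, 5, 'A'), (9, 10, 'A'), (10, 11, 'S'), (16, 17, 'P'), (17, 18, 'E')]
print(two)           # [(9, 11, 'AS'), (16, 18, 'PE')]
```

So {0} is a valid quantifier that makes the token match nothing, which yields one empty match per position (18 characters give 19 positions), while {} is not a quantifier at all.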