Prevent non-greedy part from consuming the following optional part
I have a regex with a mandatory part, a non-greedy (lazy?) part, an optional part and finally another non-greedy part.
<mandatory><non-greedy><optional><non-greedy>
Implemented as: ^mandatory.*?(?:optionalpart)?.*?$
The optionalpart consists of 'a piece to find' and 'a piece to return in a capture group'.
^mandatory.*?(?:findme(matchme))?.*?$
But for some inputs the first non-greedy part consumes characters that the following optional part should match. Is there a way to make the optional part more greedy than the previous non-greedy part?
Example: find the character after "2,", or find an empty string if there is no "2," but the mandatory part matches.
"Foo: 2,b,1,a,3,c" -> match, $1 = "b"
"Foo: 1,a,2,b,3,c" -> match, $1 = "b"
"Foo: 1,a,3,c,2,b" -> match, $1 = "b"
"Foo: 2,b" -> match, $1 = "b"
"Foo: 1,a,3,c" -> match, $1 = ""
"Fuu: 1,a,2,b,3,c" -> no match.
Attempt 1: ^Foo: .*?(?:2,([a-z]))?.*?$
This fails on the 2nd and 3rd examples, returning "" instead of "b".
Attempt 2: ^Foo: .*?(?:2,([a-z])).*?$
This fixes the previous failures, but now fails on the 5th example, which no longer matches at all.
The part that must be optional is no longer optional.
If it matters, I'm using Java's Pattern class.
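For reference, the behaviour of both attempts can be reproduced with a small Java program. This is only a sketch: the class name AttemptsDemo and the helper extract are made up for illustration, and the helper maps a non-participating group (which Java reports as null) to "" so that the output lines up with the expectations listed above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the two regex strings come from the question.
class AttemptsDemo {
    static final String ATTEMPT_1 = "^Foo: .*?(?:2,([a-z]))?.*?$";
    static final String ATTEMPT_2 = "^Foo: .*?(?:2,([a-z])).*?$";

    // Returns group 1, "" if the group did not participate in the match,
    // or null if the input does not match at all.
    static String extract(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return null;
        }
        String g = m.group(1); // null when the optional group was skipped
        return g == null ? "" : g;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Foo: 2,b,1,a,3,c",
            "Foo: 1,a,2,b,3,c",
            "Foo: 1,a,3,c,2,b",
            "Foo: 2,b",
            "Foo: 1,a,3,c",
            "Fuu: 1,a,2,b,3,c"
        };
        for (String s : inputs) {
            System.out.println(s
                + " | attempt 1: " + extract(ATTEMPT_1, s)
                + " | attempt 2: " + extract(ATTEMPT_2, s));
        }
    }
}
```

Attempt 1 only captures "b" when "2," directly follows "Foo: ", and attempt 2 reports no match (null here) for the 5th input.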
--
This was asked before, but there was no satisfactory answer for either of us.
java regex non-greedy
asked Nov 14 '18 at 15:18
Mark Jeronimus
1 Answer
Your first regex is very close; you need to move the (?: a bit further to the left so that the group also includes the .*? pattern:
^Foo:(?: .*?2,([a-z]))?.*$
     ^^^
Details
^ - start of string
Foo: - some literal text
(?: .*?2,([a-z]))? - an optional non-capturing group that is matched greedily (it will be tried at least once), 1 or 0 occurrences of:
    " .*?" - a space followed by any 0+ chars other than line break chars, as few as possible
    2, - a literal substring
    ([a-z]) - Group 1: a lowercase letter
.* - any 0+ chars other than line break chars (the rest of the string)
$ - end of string.
The general pattern will look like
^<MANDATORY_LITERAL>(?:<NON_GREEDY_DOT>(<OPTIONAL_PART>))?<GREEDY_DOT>$
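A minimal Java sketch of how this regex could be used with the Pattern class mentioned in the question. The class name OptionalGroupDemo and the method extract are hypothetical names chosen for this example. Note that in Java, Matcher.group(1) returns null (not "") when the optional group did not participate in the match, so the sketch maps null to "" to get the result the question asks for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex itself comes from the answer.
class OptionalGroupDemo {
    private static final Pattern FIXED =
        Pattern.compile("^Foo:(?: .*?2,([a-z]))?.*$");

    // Returns the letter after "2,", "" if there is no "2,",
    // or null if the mandatory "Foo:" prefix is missing.
    static String extract(String input) {
        Matcher m = FIXED.matcher(input);
        if (!m.matches()) {
            return null;
        }
        String g = m.group(1); // null when the optional group was skipped
        return g == null ? "" : g;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Foo: 2,b,1,a,3,c",
            "Foo: 1,a,2,b,3,c",
            "Foo: 1,a,3,c,2,b",
            "Foo: 2,b",
            "Foo: 1,a,3,c",
            "Fuu: 1,a,2,b,3,c"
        };
        for (String s : inputs) {
            String result = extract(s);
            System.out.println(s + " -> "
                + (result == null ? "no match" : "$1 = \"" + result + "\""));
        }
    }
}
```

Compiling the pattern once into a static field avoids recompiling it for every input.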
Wow that was quick, and works perfectly
– Mark Jeronimus
Nov 14 '18 at 15:26
And then you added explanation. So the optional group is greedy. In that case, why didn't it take priority over the previous non-greedy part in my attempt 1?
– Mark Jeronimus
Nov 14 '18 at 15:36
@MarkJeronimus Your ^Foo: .*?(?:2,([a-z]))?.*?$ did not work because, after "Foo: " (including the space) gets matched, .*? matches nothing (empty text), then (?:2,([a-z]))? matches nothing (empty text), and then the last .*?$ grabs the whole line. NOTE: the group would match some text if "2," immediately followed the space, as it does in your String 1.
– Wiktor Stribiżew
Nov 14 '18 at 15:47
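The explanation in the last comment can be checked with an instrumented copy of attempt 1. This is a sketch under one stated assumption: two extra capturing groups are wrapped around the lazy parts purely for inspection (they are not in the original regex and do not change what is matched), and the class name Attempt1Trace and method trace are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical tracing class for attempt 1.
class Attempt1Trace {
    // Attempt 1 with capturing groups added around both lazy parts:
    // group 1 = first lazy part, group 2 = letter after "2,", group 3 = trailing lazy part.
    static final Pattern INSTRUMENTED =
        Pattern.compile("^Foo: (.*?)(?:2,([a-z]))?(.*?)$");

    // Returns {first lazy part, captured letter or null, trailing lazy part},
    // or null if the input does not match.
    static String[] trace(String input) {
        Matcher m = INSTRUMENTED.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        for (String s : new String[] { "Foo: 2,b,1,a,3,c", "Foo: 1,a,2,b,3,c" }) {
            String[] g = trace(s);
            System.out.println(s
                + " -> first lazy=\"" + g[0] + "\""
                + ", letter=" + g[1]
                + ", trailing lazy=\"" + g[2] + "\"");
        }
    }
}
```

For "Foo: 1,a,2,b,3,c" the first lazy part is empty, the optional group is skipped (null), and the trailing lazy part holds "1,a,2,b,3,c": the engine never has a reason to backtrack into the first .*? because skipping the optional group already leads to an overall match.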
edited Nov 14 '18 at 15:27
answered Nov 14 '18 at 15:23
Wiktor Stribiżew