Prevent non-greedy part from consuming the following optional part












I have a regex with a mandatory part, a non-greedy (lazy?) part, an optional part and finally another non-greedy part.



<mandatory><non-greedy><optional><non-greedy>

Implemented as:
^mandatory.*?(?:optionalpart)?.*?$



The optionalpart consists of 'a piece to find' and 'a piece to return in a capture group'.



^mandatory.*?(?:findme(matchme))?.*?$



But for some inputs the first non-greedy part consumes characters that the following optional part should match. Is there a way to make the optional part more greedy than the previous non-greedy part?





Example: Find the character after the "2,", or find an empty string if there is no "2," but the mandatory part still matches.



"Foo: 2,b,1,a,3,c" -> match, $1 = "b"
"Foo: 1,a,2,b,3,c" -> match, $1 = "b"
"Foo: 1,a,3,c,2,b" -> match, $1 = "b"
"Foo: 2,b" -> match, $1 = "b"
"Foo: 1,a,3,c" -> match, $1 = ""
"Fuu: 1,a,2,b,3,c" -> no match.


Attempt 1: ^Foo: .*?(?:2,([a-z]))?.*?$

This fails on the 2nd and 3rd example, returning "" instead of "b".



Attempt 2: ^Foo: .*?(?:2,([a-z])).*?$

This fixes the previous failures, but now the 5th example no longer matches at all.

The part that must be optional is no longer optional.



If it matters, I'm using Java's Pattern class.



--



This was asked before, but there was no satisfactory answer for either of us.










      java regex non-greedy






      asked Nov 14 '18 at 15:18









      Mark Jeronimus
























          1 Answer

          Your first regex is very close: you need to move the (?: a bit to the left so that the optional group also contains the .*? pattern:



          ^Foo:(?: .*?2,([a-z]))?.*$
               ^^^


          See the regex demo



          Details





          • ^ - start of string


          • Foo: - some literal text


          • (?: .*?2,([a-z]))? - an optional non-capturing group. The ? quantifier is greedy, so the group is always tried first and is skipped only if it cannot match. It matches 1 or 0 occurrences of:



            •  .*? - a space followed by any 0+ chars other than line break chars, as few as possible


            • 2, - a literal substring


            • ([a-z]) - Group 1: a lowercase letter




          • .* - any 0+ chars other than line break chars (the rest of the string)


          • $ - end of string.


          The general pattern will look like



          ^<MANDATORY_LITERAL>(?:<NON_GREEDY_DOT>(<OPTIONAL_PART>))?<GREEDY_DOT>$
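
As a rough check of the pattern above, here is a minimal Java sketch that runs it against the six strings from the question. The class name OptionalGroupDemo and the helper extract are made up for illustration and are not part of the original answer. One assumption worth stating: in Java, Matcher.group(1) returns null (not "") when the optional group did not take part in the match, so the helper maps null to an empty string to get the behaviour the question asks for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalGroupDemo {
    // Pattern from the answer: the lazy " .*?" sits inside the optional group.
    static final Pattern FIXED = Pattern.compile("^Foo:(?: .*?2,([a-z]))?.*$");

    /**
     * Hypothetical helper: returns the letter after "2,", an empty string if
     * the line matches but contains no "2," followed by a letter, or null if
     * the line does not match at all.
     */
    static String extract(String line) {
        Matcher m = FIXED.matcher(line);
        if (!m.matches()) {
            return null; // mandatory part missing
        }
        String letter = m.group(1); // null when the optional group was skipped
        return letter == null ? "" : letter;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Foo: 2,b,1,a,3,c",
            "Foo: 1,a,2,b,3,c",
            "Foo: 1,a,3,c,2,b",
            "Foo: 2,b",
            "Foo: 1,a,3,c",
            "Fuu: 1,a,2,b,3,c"
        };
        for (String input : inputs) {
            String result = extract(input);
            System.out.println("\"" + input + "\" -> "
                    + (result == null ? "no match" : "match, $1 = \"" + result + "\""));
        }
    }
}
```

Because matches() already requires the whole input to match, the ^ and $ anchors are redundant here; they are kept only so the pattern string stays identical to the one in the answer.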





            Wow that was quick, and works perfectly

            – Mark Jeronimus
            Nov 14 '18 at 15:26











          • And then you added an explanation. So the optional group is greedy. In that case, why didn't it take priority over the previous non-greedy part in my attempt 1?

            – Mark Jeronimus
            Nov 14 '18 at 15:36













          • @MarkJeronimus Your ^Foo: .*?(?:2,([a-z]))?.*?$ did not work because, once "Foo: " (including the space) has been matched, .*? first matches nothing (empty text), and then (?:2,([a-z]))? is tried at that same position. It only matches some text if "2," immediately follows the space, as it does in your 1st string; otherwise it matches nothing (empty text). An empty match still counts as success, so the engine never goes back to let the first .*? grow, and the last .*?$ consumes the rest of the line.

            – Wiktor Stribiżew
            Nov 14 '18 at 15:47
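
The walk-through in the comment above can be reproduced with a small Java sketch. The class name LazyBeforeOptionalDemo and the helper groupOne are hypothetical names chosen for illustration. The sketch applies attempt 1 from the question and the pattern from the answer to the same strings and returns whatever group 1 captured, or null if the group was skipped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyBeforeOptionalDemo {
    // Attempt 1 from the question: lazy ".*?" sits in front of the optional group.
    static final String ATTEMPT_1 = "^Foo: .*?(?:2,([a-z]))?.*?$";
    // Pattern from the answer: lazy ".*?" sits inside the optional group.
    static final String FIXED = "^Foo:(?: .*?2,([a-z]))?.*$";

    /** Returns group 1, or null if the input does not match or the group was skipped. */
    static String groupOne(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String first = "Foo: 2,b,1,a,3,c";  // "2," directly follows the space
        String second = "Foo: 1,a,2,b,3,c"; // "2," appears later in the line

        // Attempt 1 only captures when "2," sits right after "Foo: ".
        System.out.println(groupOne(ATTEMPT_1, first));
        System.out.println(groupOne(ATTEMPT_1, second));

        // With the lazy dot inside the group, the group itself searches for "2,".
        System.out.println(groupOne(FIXED, first));
        System.out.println(groupOne(FIXED, second));
    }
}
```

For the second string, attempt 1 leaves group 1 unset because the optional group succeeds with an empty match at the first position it is tried, which is exactly the behaviour the comment describes.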











          answered Nov 14 '18 at 15:23, edited Nov 14 '18 at 15:27

          Wiktor Stribiżew







