How do I get good error reporting for missing tokens in ANTLR 4?












I've written an ANTLR 4 grammar for a C-style programming language and am trying to improve error messages for missing delimiters, such as a missing comma or missing closing parenthesis in a function argument list.



I thought that ANTLR's single token insertion mechanism would accurately detect missing tokens, but instead I often get a "no viable alternative" error. Here is an example:



root
: expr+ EOF
;

expr
: '(' expr ')'
| '(' ')' '->' expr
| ID
;

ID: [a-zA-Z0-9$_]+;

Whitespace
: [ \t\r\n\f]+ -> skip
;


Parsing input ( -> foo results in:



line 1:2 no viable alternative at input '(->'


Instead I'd like to see missing ')'. What's the recommended way to achieve this?



The only working solution I've found to date is to make all delimiters optional in the grammar and act on their absence in a visitor.
However, this feels like a workaround and has significant drawbacks that I'd rather avoid.










Tags: java, parsing, antlr, antlr4






asked Nov 15 '18 at 8:22 by Fred Curts (edited Nov 15 '18 at 21:27)

          1 Answer

          The first thing to note is that you get almost exactly the error message you want if you remove the '(' expr ')' alternative:



          expr
          : '(' ')' '->' expr
          | ID
          ;


          Error message:



          line 1:2 missing ')' at '->'
          line 1:4 mismatched input '<EOF>' expecting {'(', ID}


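          I believe you get errors like "mismatched input X expecting Y" and "missing X" when the parser can pick an alternative from a single token of lookahead (an LL(1) decision), and "no viable alternative" when choosing between alternatives needs more lookahead at the current position. In your grammar both parenthesized alternatives start with '(', so the parser has to look past it to choose; at '->' neither alternative fits, and prediction fails before single-token insertion gets a chance to run.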
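To make that distinction concrete, here is a minimal hand-written sketch in Java. It is not ANTLR's implementation, and every name in it (MissingTokenSketch, firstErrorPredicted, firstErrorCommitted) is made up for this illustration. It assumes the input is already split into tokens, recognizes a single expr, and stops at the first error, whereas ANTLR would report the error and keep parsing.

```java
import java.util.Arrays;
import java.util.List;

/** Toy recognizer for illustration only; this is not how ANTLR is implemented. */
final class MissingTokenSketch {
    private final List<String> tokens;
    private int pos = 0;

    private MissingTokenSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Returns the k-th lookahead token (1-based), or "<EOF>" past the end. */
    private String la(int k) {
        int i = pos + k - 1;
        return i < tokens.size() ? tokens.get(i) : "<EOF>";
    }

    private static boolean isId(String t) {
        return t.matches("[a-zA-Z0-9$_]+");
    }

    /** Consumes the expected token or fails with a mismatch message. */
    private void match(String expected) {
        if (!la(1).equals(expected)) {
            throw new IllegalStateException(
                "mismatched input '" + la(1) + "' expecting '" + expected + "'");
        }
        pos++;
    }

    /**
     * Original grammar: expr : '(' expr ')' | '(' ')' '->' expr | ID ;
     * Two alternatives start with '(', so the choice depends on the token
     * after it and is made before anything is consumed.
     */
    private void exprPredicted() {
        if (la(1).equals("(") && (la(2).equals("(") || isId(la(2)))) {
            match("(");
            exprPredicted();
            match(")");
        } else if (la(1).equals("(") && la(2).equals(")")) {
            match("(");
            match(")");
            match("->");
            exprPredicted();
        } else if (isId(la(1))) {
            pos++;
        } else {
            // Prediction failed: no alternative is in progress yet,
            // so there is no single "expected" token to name.
            String seen = la(1).equals("(") ? la(1) + la(2) : la(1);
            throw new IllegalStateException("no viable alternative at input '" + seen + "'");
        }
    }

    /**
     * Reduced grammar: expr : '(' ')' '->' expr | ID ;
     * '(' alone selects the alternative, so the bad token is only noticed
     * while matching ')', where the missing token can be named.
     */
    private void exprCommitted() {
        if (la(1).equals("(")) {
            match("(");
            if (la(1).equals("->")) {
                // The current token is what should follow ')': assume ')' was left out.
                throw new IllegalStateException("missing ')' at '->'");
            }
            match(")");
            match("->");
            exprCommitted();
        } else if (isId(la(1))) {
            pos++;
        } else {
            throw new IllegalStateException("mismatched input '" + la(1) + "' expecting {'(', ID}");
        }
    }

    /** Runs the original-grammar recognizer and returns its first error, or "ok". */
    static String firstErrorPredicted(List<String> tokens) {
        try {
            new MissingTokenSketch(tokens).exprPredicted();
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    /** Runs the reduced-grammar recognizer and returns its first error, or "ok". */
    static String firstErrorCommitted(List<String> tokens) {
        try {
            new MissingTokenSketch(tokens).exprCommitted();
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("(", "->", "foo");
        System.out.println(firstErrorPredicted(input)); // no viable alternative at input '(->'
        System.out.println(firstErrorCommitted(input)); // missing ')' at '->'
    }
}
```

The same three tokens give two different messages. What differs is whether the parser has already committed to an alternative when it meets '->'.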



          So with that in mind, we can try to rewrite your grammar to be LL(1):



          expr
          : '(' ( expr ')' | ')' '->' expr )
          | ID
          ;


          Then the error message becomes:



          line 1:2 mismatched input '->' expecting {'(', ')', ID}


          That's pretty close to what you want.






          answered Nov 15 '18 at 21:25 by sepp2k













          • Thanks for your suggestion. Unfortunately, this refactoring doesn't get me close enough to what I want. In my real-world grammar, the ensuing error message contains many more than three expected tokens, and it isn't obvious at all from that message that a ) is missing. (Another problem that I discovered is that this refactoring is incompatible with ANTLR 4's way of dealing with operator precedence, which seems to require top-level alternatives.)

            – Fred Curts
            Nov 15 '18 at 21:40
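One possible way to deal with a long expected-token list is to post-process it before building the message. The sketch below is an assumption-laden illustration, not an ANTLR feature: the class ExpectedSetSketch, the method describe, and the choice of closing delimiters are all hypothetical. It takes the expected tokens as plain strings; in ANTLR, as far as I know, the corresponding set is available from RecognitionException.getExpectedTokens(). The heuristic is that if a closing delimiter is among the expected tokens, it is reported as missing and the rest of the list is left out.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper that turns an expected-token set into a shorter message. */
final class ExpectedSetSketch {
    /** Delimiters worth singling out; this list is an assumption for the example. */
    private static final Set<String> CLOSERS =
        new LinkedHashSet<>(Arrays.asList(")", "]", "}"));

    /**
     * Prefers "missing <closer>" when a closing delimiter is expected,
     * and otherwise falls back to listing every expected token.
     */
    static String describe(Set<String> expected, String offending) {
        for (String closer : CLOSERS) {
            if (expected.contains(closer)) {
                return "missing '" + closer + "' at '" + offending + "'";
            }
        }
        return "mismatched input '" + offending + "' expecting {"
            + String.join(", ", expected) + "}";
    }

    public static void main(String[] args) {
        Set<String> expected = new LinkedHashSet<>(Arrays.asList("(", ")", "ID"));
        System.out.println(describe(expected, "->")); // missing ')' at '->'
    }
}
```

This is only a guess at what the user meant: in ( -> foo the inner expression could just as well be the missing part, so a heuristic like this can point at the wrong token.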




















