How do I fix/edit this regular expression?











lcl|NZ_AP012542.1_cds_WP_003600377.1_1 [locus_tag=LBPC_RS14705] [db_xref=GeneID:31583580] [protein=RepB family plasmid replication initiator protein] [protein_id=WP_003600377.1] [location=1..780] [gbkey=CDS]
ATGGCAAATACAATCAACAAAAAACAAAATCTGGCGATGCAGGCGTTGCTTAAACGCCAAGACTATCTTG



lcl|NZ_AP012542.1_cds_WP_016377574.1_2 [locus_tag=LBPC_RS14710] [db_xref=GeneID:31583581] [protein=DUF536 domain-containing protein] [protein_id=WP_016377574.1] [location=complement(1459..1956)] [gbkey=CDS]
ATGAGTAAGACCATCAAAGAACTTGCAGAGGAATTGAGCTTATCTAAATCTGGTATTCGTAAATATCTAA




I want to extract the word after locus_tag= (only LBPC_RS14705 and LBPC_RS14710). How do I fix this regular expression?



[locus_tag][=]\w+










javascript regex






– Glufflix, asked 21 hours ago (edited 20 hours ago by quant)

  • What exactly do you want as your desired output, LBPC_RS14705 or the whole text after it?
    – rv7
    21 hours ago

  • @rv7 Only LBPC_RS14705 and LBPC_RS14710.
    – Glufflix
    21 hours ago

  • You need a capturing group around your \w+, like this: (\w+)
    – rv7
    21 hours ago

  • @rv7 Thanks a lot! It's very helpful for me.
    – Glufflix
    20 hours ago

  • @Glufflix I updated my answer to retrieve both tags
    – Nick Parsons
    20 hours ago


















2 Answers
You can use the following regular expression to match the locus_tag:



/\[locus_tag=(\w+)\]/g;



This expression captures the word characters after "locus_tag=" in group 1. Because the regex has the g flag, each call to .exec(str) resumes where the previous match ended, so calling .exec(str)[1] twice returns both tags.
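As a side note on why the pattern in the question misbehaves: unescaped square brackets define a character class, so [locus_tag] matches a single character from that set rather than the literal word. The sketch below illustrates this on a shortened copy of the first header line; the variable names (header, broken, fixed, brokenMatch) are made up for this illustration.

```javascript
// Shortened copy of the first header line from the question.
const header =
  "lcl|NZ_AP012542.1_cds_WP_003600377.1_1 [locus_tag=LBPC_RS14705] [db_xref=GeneID:31583580]";

// Pattern from the question (with the backslash in \w): [locus_tag] is a
// character class, so it consumes exactly one character, here the "g"
// that sits directly in front of "=".
const broken = /[locus_tag][=]\w+/;
const brokenMatch = header.match(broken);
console.log(brokenMatch[0]); // g=LBPC_RS14705

// Escaping the brackets makes them literal, and (\w+) captures only the value.
const fixed = /\[locus_tag=(\w+)\]/;
console.log(header.match(fixed)[1]); // LBPC_RS14705
```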



See a working example below:






const str = 
`lcl|NZ_AP012542.1_cds_WP_003600377.1_1 [locus_tag=LBPC_RS14705] [db_xref=GeneID:31583580] [protein=RepB family plasmid replication initiator protein] [protein_id=WP_003600377.1] [location=1..780] [gbkey=CDS] ATGGCAAATACAATCAACAAAAAACAAAATCTGGCGATGCAGGCGTTGCTTAAACGCCAAGACTATCTTG

lcl|NZ_AP012542.1_cds_WP_016377574.1_2 [locus_tag=LBPC_RS14710] [db_xref=GeneID:31583581] [protein=DUF536 domain-containing protein] [protein_id=WP_016377574.1] [location=complement(1459..1956)] [gbkey=CDS] ATGAGTAAGACCATCAAAGAACTTGCAGAGGAATTGAGCTTATCTAAATCTGGTATTCGTAAATATCTAA`;

// The g flag makes exec() continue from regex.lastIndex on each call
const regex = /\[locus_tag=(\w+)\]/g;
console.log(regex.exec(str)[1]); // First call returns the first tag: LBPC_RS14705
console.log(regex.exec(str)[1]); // Second call returns the second tag: LBPC_RS14710
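If the number of records is not known in advance, calling exec a fixed number of times becomes awkward. A minimal sketch of one alternative, assuming a runtime that supports String.prototype.matchAll (ES2020), is shown below. The sample text is shortened from the question, and the names fasta, tagPattern and locusTags are hypothetical.

```javascript
// Shortened two-record sample based on the data in the question.
const fasta = `lcl|NZ_AP012542.1_cds_WP_003600377.1_1 [locus_tag=LBPC_RS14705] [db_xref=GeneID:31583580]
ATGGCAAATACAATCAACAAAAAACAAAATCTGGCGATGCAGGCGTTGCTTAAACGCCAAGACTATCTTG
lcl|NZ_AP012542.1_cds_WP_016377574.1_2 [locus_tag=LBPC_RS14710] [db_xref=GeneID:31583581]
ATGAGTAAGACCATCAAAGAACTTGCAGAGGAATTGAGCTTATCTAAATCTGGTATTCGTAAATATCTAA`;

// matchAll requires the g flag and yields one match object per occurrence.
const tagPattern = /\[locus_tag=(\w+)\]/g;

// Keep only capture group 1 (the tag value) from every match.
const locusTags = Array.from(fasta.matchAll(tagPattern), (m) => m[1]);

console.log(locusTags); // [ 'LBPC_RS14705', 'LBPC_RS14710' ]
```

This collects every tag in one pass regardless of how many header lines the input contains.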








– Nick Parsons, answered 21 hours ago (edited 18 hours ago)



    You can also try any of the following approaches.




    Here I've assumed that your locus tag consists only of word characters (letters, digits and underscores), which is what \w+ matches.



    Helpful link: https://javascript.info/regexp-groups




    1st way



    var s1 = "lcl|NZ_AP012542.1_cds_WP_003600377.1_1 [locus_tag=LBPC_RS14705] [db_xref=GeneID:31583580] [protein=RepB family plasmid replication initiator protein] [protein_id=WP_003600377.1] [location=1..780] [gbkey=CDS] ATGGCAAATACAATCAACAAAAAACAAAATCTGGCGATGCAGGCGTTGCTTAAACGCCAAGACTATCTTG";

    var s2 = "lcl|NZ_AP012542.1_cds_WP_016377574.1_2 [locus_tag=LBPC_RS14710] [db_xref=GeneID:31583581] [protein=DUF536 domain-containing protein] [protein_id=WP_016377574.1] [location=complement(1459..1956)] [gbkey=CDS] ATGAGTAAGACCATCAAAGAACTTGCAGAGGAATTGAGCTTATCTAAATCTGGTATTCGTAAATATCTAA";

    // Group 1 is the whole "locus_tag=..." pair; group 2 is only the value
    const regEx = /(locus_tag=(\w+))/;

    var locus_tag1 = s1.match(regEx)[2];
    var locus_tag2 = s2.match(regEx)[2];

    console.log(locus_tag1); // LBPC_RS14705
    console.log(locus_tag2); // LBPC_RS14710


    2nd way



    var s1 = "lcl|NZ_AP012542.1_cds_WP_003600377.1_1 [locus_tag=LBPC_RS14705] [db_xref=GeneID:31583580] [protein=RepB family plasmid replication initiator protein] [protein_id=WP_003600377.1] [location=1..780] [gbkey=CDS] ATGGCAAATACAATCAACAAAAAACAAAATCTGGCGATGCAGGCGTTGCTTAAACGCCAAGACTATCTTG";

    var s2 = "lcl|NZ_AP012542.1_cds_WP_016377574.1_2 [locus_tag=LBPC_RS14710] [db_xref=GeneID:31583581] [protein=DUF536 domain-containing protein] [protein_id=WP_016377574.1] [location=complement(1459..1956)] [gbkey=CDS] ATGAGTAAGACCATCAAAGAACTTGCAGAGGAATTGAGCTTATCTAAATCTGGTATTCGTAAATATCTAA";

    // match(...)[0] is the full match "locus_tag=..."; splitting on "=" keeps the value
    const regEx = /(locus_tag=\w+)/;

    var locus_tag1 = s1.match(regEx)[0].split('=')[1];
    var locus_tag2 = s2.match(regEx)[0].split('=')[1];

    console.log(locus_tag1); // LBPC_RS14705
    console.log(locus_tag2); // LBPC_RS14710
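Both approaches above index straight into the result of match, which is null when a line has no locus_tag field (for example a sequence line), so the lookup would throw. A small guarded sketch follows; the function name extractLocusTag is made up for this illustration, and it uses a single capturing group so the value is at index 1.

```javascript
// Hypothetical helper: returns the locus tag of one line, or null if absent.
function extractLocusTag(line) {
  const m = line.match(/\[locus_tag=(\w+)\]/);
  // String.prototype.match returns null when nothing matches.
  return m === null ? null : m[1];
}

const headerLine =
  "lcl|NZ_AP012542.1_cds_WP_016377574.1_2 [locus_tag=LBPC_RS14710] [db_xref=GeneID:31583581]";
const sequenceLine = "ATGAGTAAGACCATCAAAGAACTTGCAGAGGAATTGAGCTTATCTAAATCTGGTATTCGTAAATATCTAA";

console.log(extractLocusTag(headerLine));   // LBPC_RS14710
console.log(extractLocusTag(sequenceLine)); // null
```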





    – hygull, answered 18 hours ago (edited 18 hours ago)