How can we find the domain name using MySQL and a regular expression?

I have a list of URLs in the database, such as:

    http://www.masn.com/index.html
    http://www.123musiq.com/index.html

What I need as output is:

    http://www.masn.com
    http://www.123musiq.com

How can I do that with a regular expression?










mysql

edited Aug 19 '10 at 12:08
asked Aug 19 '10 at 11:06
Alex Mathew
























5 Answers

















































In MySQL versions before 8.0, regular expressions can match but not return substrings.

You can use SUBSTRING_INDEX:

    SELECT  SUBSTRING_INDEX('www.example.com', '/', 1)

However, this is not protocol-prefix safe: for a URL that starts with http://, it returns only http:.

If you are using a mix of prefixed and unprefixed URLs, use this:

    SELECT  url RLIKE '^http://',
            CASE
            WHEN url RLIKE '^http://' THEN
                    SUBSTRING_INDEX(SUBSTRING_INDEX(url, '/', 3), '/', -1)
            ELSE
                    SUBSTRING_INDEX(url, '/', 1)
            END
    FROM    (
            SELECT 'www.example.com/test/test' AS url
            UNION ALL
            SELECT 'http://www.example.com/test'
            ) q
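
As a side note for readers on newer servers: MySQL 8.0 added REGEXP_SUBSTR, which does return the matched substring. The following is only a sketch under that assumption (it will not run on the 5.x versions current when this was asked), and the pattern is an illustrative choice rather than a full URL parser. It uses the two URLs from the question as sample data.

```sql
-- Assumes MySQL 8.0 or later, where REGEXP_SUBSTR is available.
-- The pattern matches the scheme, '://', and everything up to the next '/'.
SELECT REGEXP_SUBSTR(url, '^[a-z]+://[^/]+') AS site
FROM (
    SELECT 'http://www.masn.com/index.html' AS url
    UNION ALL
    SELECT 'http://www.123musiq.com/index.html'
) AS q;
-- http://www.masn.com
-- http://www.123musiq.com
```

REGEXP_SUBSTR returns NULL when nothing matches, so URLs stored without a scheme would need a separate branch, as in the CASE expression above.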






edited Aug 19 '10 at 12:16
answered Aug 19 '10 at 11:12
Quassnoi








• it is just returning http:
  – Alex Mathew
  Aug 19 '10 at 12:07

• @Alex: as I already said in the answer, it's not protocol prefix safe.
  – Quassnoi
  Aug 19 '10 at 12:11
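
To illustrate the exchange above: SUBSTRING_INDEX(str, delim, n) returns everything before the n-th occurrence of the delimiter, and in a URL with a protocol prefix the first '/' is the one right after 'http:'. A small sketch using one of the URLs from the question shows the difference between asking for one piece and asking for three, which happens to give the output the question wants.

```sql
-- Splitting 'http://www.masn.com/index.html' on '/' gives the pieces
-- 'http:', '', 'www.masn.com' and 'index.html'.
SELECT SUBSTRING_INDEX('http://www.masn.com/index.html', '/', 1) AS first_piece,
       SUBSTRING_INDEX('http://www.masn.com/index.html', '/', 3) AS first_three_pieces;
-- first_piece:        http:
-- first_three_pieces: http://www.masn.com
```

Using a count of 3 only works when every URL carries a protocol prefix; for an unprefixed value such as 'www.masn.com/a/b/c' it would return 'www.masn.com/a/b'.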














Use SUBSTRING_INDEX:

http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_substring-index

For example:

    SELECT  SUBSTRING_INDEX(urlfield, '/', 1) FROM mytable






answered Aug 19 '10 at 11:12
Haim Evgi























    SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('http://www.domain.com/', '://', -1), '/', 1);

Result: www.domain.com

    SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX('http://www.domain.com/', '://', -1), '/', 1), 'www.', -1);

Result: domain.com
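
One caveat with stripping 'www.' through SUBSTRING_INDEX(..., 'www.', -1) is that it cuts at the last 'www.' anywhere in the host, so a host like 'awww.example.com' would come back as 'example.com'. A possible variation, sketched here with made-up sample URLs and hypothetical aliases (sample_urls, hosts, bare_host), removes the prefix only when the host actually starts with it.

```sql
-- Sample data and alias names are illustrative assumptions.
SELECT host,
       -- Drop the first four characters only for hosts that begin with 'www.'.
       IF(host LIKE 'www.%', SUBSTRING(host, 5), host) AS bare_host
FROM (
    SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(url, '://', -1), '/', 1) AS host
    FROM (
        SELECT 'http://www.domain.com/' AS url
        UNION ALL
        SELECT 'http://awww.example.com/page'
    ) AS sample_urls
) AS hosts;
-- www.domain.com    ->  domain.com
-- awww.example.com  ->  awww.example.com
```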







answered Mar 8 '13 at 14:10
Alexander K.























Based on these answers, I came up with a similar solution, but it requires multiple queries.

    SELECT SUBSTRING_INDEX(url, '/', 1) FROM `table` WHERE url NOT REGEXP '^[^:]+://';
    SELECT SUBSTRING_INDEX(url, '/', 3) FROM `table` WHERE url REGEXP '^[^:]+://';

The first query handles URLs without a protocol prefix. The second query handles URLs with a protocol prefix. Here `table` is a placeholder for your table name; because TABLE is a reserved word in MySQL, it has to be quoted with backticks if used literally. Note that these queries do not handle every valid URL, but they should handle most well-formed ones.
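
If a single result set is preferred, the two queries can probably be folded into one with a CASE expression that applies the same REGEXP test per row. This is a sketch only; the inline sample rows and the alias names (q, site) are assumptions standing in for a real table.

```sql
-- Same prefix test as the two queries above, evaluated row by row.
SELECT url,
       CASE
           WHEN url REGEXP '^[^:]+://' THEN SUBSTRING_INDEX(url, '/', 3)
           ELSE SUBSTRING_INDEX(url, '/', 1)
       END AS site
FROM (
    SELECT 'www.example.com/test/test' AS url
    UNION ALL
    SELECT 'https://www.example.com/test'
) AS q;
-- www.example.com/test/test     ->  www.example.com
-- https://www.example.com/test  ->  https://www.example.com
```

The same limitation applies as for the separate queries: this is string splitting, not URL parsing, so unusual but valid URLs may not be handled.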







answered Feb 17 '12 at 23:30
jimp























If you're not afraid of installing MySQL extensions (UDFs), there's a UDF that does exactly this while respecting different top-level domains such as "google.com" and "google.co.uk", and it handles many other edge cases:

https://github.com/StirlingMarketingGroup/mysql-get-etld-p1

    select `get_etld_p1`('http://a.very.complex-domain.co.uk:8080/foo/bar'); -- 'complex-domain.co.uk'
    select `get_etld_p1`('https://www.bbc.co.uk/'); -- 'bbc.co.uk'
    select `get_etld_p1`('https://github.com/StirlingMarketingGroup/'); -- 'github.com'
    select `get_etld_p1`('https://localhost:10000/index'); -- 'localhost'
    select `get_etld_p1`('android-app://com.google.android.gm'); -- 'com.google.android.gm'
    select `get_etld_p1`('example.test.domain.com'); -- 'domain.com'
    select `get_etld_p1`('postgres://user:pass@host.com:5432/path?k=v#f'); -- 'host.com'
    select `get_etld_p1`('exzvk.omsk.so-ups.ru'); -- 'so-ups.ru'
    select `get_etld_p1`('http://10.64.3.5/data_check/index.php?r=index/rawdatacheck'); -- '10.64.3.5'
    select `get_etld_p1`('not a domain'); -- null






answered Nov 12 '18 at 23:32
Brian Leishman





























